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4.  STATEMENT  OF  THE  PROBLEM  STUDIED 

In  the  last  decade,  MMICs  have  played  a  leading  role  in  the  design  of  microwave 
communications  systems.  As  the  fabrication  technologies  of  these  devices  have  matured,  both 
military  and  commercial  applications  have  demanded  smaller  and  denser  MMICs  operating  over 
larger  bandwidths  and  at  higher  operating  frequencies  (into  the  Kq  and  U  bands).  This  has  lead  to 
an  increased  need  for  more  rigorous  analyses  that  explicitly  model  the  electromagnetic  interactions 
within  these  devices.  Although,  it  is  instmctive  to  isolate  each  element  of  the  MMIC  and  analyze 
them  independently,  proper  characterization  of  densely  packaged  circuits  require  the  rigorous 
analysis  of  entire  circuits  simultaneously,  or  large  blocks  of  circuits.  Specifically,  electromagnetic 
effects  such  as  coupling,  radiation,  surface  waves,  and  other  affects  that  lead  to  induced  signals  on 
neighboring  signal  lines  need  be  accounted  for  to  accurately  characterize  the  circuit  devices. 

The  objective  of  this  research  has  been  to  develop  a  MMIC  circuit  simulator  based  on  direct 
time-domain  solutions  of  Maxwell’s  equations.  The  simulator  is  capable  of  analyzing  MMICs 
using  either  the  traditional  finite-difference  time-domain  (FDTD)  method,  an  explicit  planar 
generalized  Yee-algorithm  (PGY)  based  on  generalized  unstructured  grids,  or  the  implicit  Finite 
Element  Time-Domain  (FETD)  method.  The  FDTD  method  is  a  highly  robust  and  efficient 
computational  method  that  is  well  suited  for  circuits  whose  geometries  are  separable  in  a  Cartesian 
coordinate  system.  The  PGY  and  the  FETD  methods,  which  are  based  on  volume  discretizations 
using  unstructured  grids,  are  much  better  suited  for  analyzing  circuits  with  complex  geometries. 
The  FETD  offers  a  higher-order  approximation  of  the  fields.  Furthermore,  it  can  be  posed  in  a 
form  that  is  unconditionally  stable.  As  a  result,  the  time  step  is  not  restricted  by  a  Courant  limit. 
Unfortunately,  in  many  applications  it  is  still  more  computationally  intensive  than  the  PGY  method 
since  it  requires  a  sparse  matrix  solution  required  at  each  time  iteration. 

In  the  course  of  the  development  of  these  algorithms,  it  was  found  that  most  non-orthogonal 
FDTD  methods,  including  locally-deformed  FDTD  methods,  non-orthogonal  FDTD  methods,  and 
the  PGY  method,  suffer  from  late  time  instability.  Typically,  these  late  time-instabilities  will 
corrupt  the  analysis  of  large  resonant  structures  requiring  large  numbers  of  time  steps.  Upon 
investigating,  it  was  found  that  these  algorithms  are  not  well  posed.  Subsequently,  a  sufficient  test 
for  well-posedness  was  developed.  Secondly,  methods  were  introduced  that  rendered  these 
algorithms  to  be  well  posed  were  developed. 

A  second  objective  of  the  proposed  research  has  been  to  develop  efficient  high-performance 
parallel  algorithms  for  the  implicit  and  explicit  time-domain  methods.  Due  to  the  characteristics  of 
volume  discretizations,  the  FDTD,  PGY,  and  FETD  methods  can  be  extremely  computationally 
intensive  when  solving  large  resonant  structures.  It  has  been  shown  that  these  algorithms  have 
high  degrees  of  parallelism,  and  very  large  problems  can  be  efficiently  analyzed  on  high- 
performance  parallel  algorithms.  Explicit  schemes,  such  as  the  FDTD  and  PGY  methods,  are  most 
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efficiently  parallelized  using  a  divide  and  conquer  scheme  based  on  a  spatial  decomposition  of  the 
volume  mesh.  On  the  other  hand,  a  divide  and  conquer  approach,  such  as  a  spatial  decomposition 
in  conjunction  with  a  parallel  iterative  linear  system  solution  such  as  the  Conjugate  Gradient  (CG) 
method,  for  implicit  schemes  such  as  the  FETD  is  not  necessarily  optimal.  This  is  because  as  the 
problem  size  increases,  the  condition  number  of  the  sparse  matrix  comensurately  increases. 
Subsequently,  the  number  of  iterations  required  to  converge  at  each  time  step  grows  with  the 
problem  size.  A  more  efficient  approach  has  been  found  that  is  based  on  the  method  of  Lagrange 
multipliers.  Specifically,  a  spatial  decomposition  is  performed  on  the  global  mesh.  Each 
subregion  is  then  mapped  to  a  processor  and  solved  using  a  sparse  direct  solver  locally. 
Constraints  enforcing  the  continuity  of  the  tangential  fields  are  impressed  across  each  interface 
using  Lagrange  multipliers,  leading  to  a  global  linear  system  of  equations.  This  method  has  been 
referred  to  as  the  Finite-Element  Tearing  and  Interconnecting  (FETI)  algorithm.  It  has  been  found 
that  this  method  actually  results  in  super-linear  speedups  for  the  FETD  algorithm  due  to  the  nature 
of  the  condition  number  of  the  linear  system. 

A  final  objective  of  the  research  has  been  to  develop  the  capability  of  analyzing  both  linear  and 
non-linear  circuits.  To  this  end,  lumped  circuit  models  have  been  introduced  into  the  FDTD  and 
PGY  algorithms  using  a  state  variable  technique.  The  state  variable  method  allows  for  the 
inclusion  of  linear  and  non-linear  devices  to  be  directly  incorporated  into  the  FDTD  and  PGY 
simulations  in  a  manner  that  is  device  independent.  Specifically,  a  device  library  can  be  created 
such  that  the  FDTD  algorithm  is  independent  of  the  device.  Rather,  the  port  voltages  or  currents 
are  provided  to  the  generic  State  variable  model,  and  the  output  port  currents  or  voltages  are  then 
returned  to  the  FDTD  method  and  then  applied  to  the  local  fields. 

Other  subsequent  venues  have  been  pursued  through  the  course  of  this  research  that  have 
spawned  from  the  development  of  the  above  algorithms.  Initially,  one  limitation  of  finite  methods 
;  such  as  FDTD  or  FETD  is  when  solving  unbounded  problems,  a  mesh  truncation  condition  must 
be  introduced  which  efficiently  and  accurately  terminates  (or  absorbers)  all  radiation  incident  upon 
an  exterior  boundary.  For  many  applications,  sqch  as  computing  the  DC  bias  of  a  non-linear  FET 
device,  traditional  pseudo-differential  based  absorbing  boundaries  introduce  significant  error. 
Subsequently,  alternative  absorbing  boundary  conditions  were  pursued.  This  lead  to  the 
development  of  an  anisotropic  perfectly  matched  layer  (PML)  medium  for  the  truncation  of  FDTD 
lattices.  This  method,  which  is  very  similar  in  characteristic  to  Berenger's  split-field  PML,  is 
based  on  a  physical  interpretation  of  the  perfectly  matched  layer  medium  as  a  uniaxial  medium 
rather  than  a  mathematical  medium.  This  has  lead  to  the  immediate  extension  of  the  PML  to 
inhomogeneous  media,  lossy  media,  dispersive  media,  and  non-linear  media. 

Another  development  that  has  stemmed  from  this  research  is  the  development  of  an  algorithm 
which  extends  FDTD  and  non-orthogonal  FDTD  to  periodic  structures.  This  work  was  initiated  by 
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the  Signature  Technology  Laboratory  at  Georgia  Tech.  Research  Institute.  The  difficulty  with 
existing  periodic  FDTD  based  algorithms  was  that  they  were  inherently  unstable.  To  circumvent 
this,  a  new  algorithm  based  on  a  split-field  formulation  has  been  developed  for  both  orthogonal 
and  non-orthogonal  FDTD  methods  that  efficiently  treats  the  scattering  and  radiation  by  periodic 
structures  that  is  stable  for  all  angles  of  incidence,  and  highly  efficient. 

5.  SUMMARY  OF  IMPORTANT  RESULTS 
I.  Accomplishments 

As  per  meeting  the  objectives  outlined  in  Section  4,  a  summary  of  the  specific  accomplishments 
of  this  research  program  are  encapsulated  as  follows: 

1)  Development  of  the  planar  generalized  Yee-algorithm  (PGY),  an  explicit  time-domain 
solution  of  Maxwell's  equations  that  assumes  a  volume  discretization  based  on  an 
unstructured  grid  [1-5].  The  PGY  algorithm  exploits  the  planar  symmetry  of  microwave 
circuits  composed  of  planar  stratified  circuits  and  vias  to  greatly  reduce  the  memory 
overhead.  This  algorithm  has  been  successfully  applied  to  large  circuits  and  antennas. 

2)  The  development  of  a  test  of  well-posedness  for  explicit  time-dependent  solutions  of 
Maxwell’s  equations  based  on  staggered  grids.  From  this  theory,  well-posed  PGY 
algorithms  and  non-orthogonal  FDTD  algorithms  have  been  successfully  derived. 

3)  The  development  of  an  unconditionally  stable  implicit  finite-element  time-domain  (FETD) 

solution  of  the  inhomogeneous  vector  wave  equation  [6, 7].  ^ 

4)  The  development  of  a  highly  scalable  parallel  algorithms  for  the  finite-difference  time- 
domain  (FDTD)  method,  the  generalized  Yee  and  PGY  algorithms,  and  the  FETD  method 
[8-12]. 

5)  The  development  of  a  scalable  parallel  algorithm  for  the  solution  of  sparse  matrices  arising 
fi-om  the  FEM  discretization  of  the  vector  wave  equation.  The  technique  is  based  on  the 
method  of  Lagrange  Multipliers,  and  has  been  used  for  both  frequency  dependent  [  1 3]  and 
time-dependent  FEM  simulations  [7]. 

6)  The  development  of  a  uniaxial  perfectly  matched  layer  medium  (UPML)  for  the  truncation 
of  FDTD  lattices.  This  technique  provides  absorption  on  the  order  of  90  dB  over  broad 
frequency  bands.  The  UPML  has  been  successfully  applied  to  inhomogeneous  media, 
lossy  media,  dispersive,  and  non-linear  media  [14,  15],  as  well  as  to  the  non-orthogonal 
FDTD  algorithm  [16],  and  the  frequency-domain  FEM  method  [13]. 

7)  This  work  has  helped  to  promote  collaborative  efforts  with  engineers  at  Hughes  Aircraft 
Company,  Georgia  Tech  Research  Institute  and  the  Jet  Propulsion  Laboratory. 
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8)  This  effort  has  supported  3  M.S.  students,  2  Ph.  D.  students,  and  a  post  doctoral 

associate.  , 

9)  Based  on  this  effort  12  journal  articles  and  5  book  chapters  have  been  published,  3  other 
journal  articles  are  currently  under  review,  and  12  papers  have  been  published  in  the 
proceedings  of  International  Symposia.  ^ 

In  the  remainder  of  this  report,  summaries  of  these  accomplishments  are  provided.  Specifically, 
in  Section  II,  the  explicit  planar  generalized  Yee-algorithm  is  summarized  with  some  examples  of  a 
few  of  the  many  microwave  circuits  analyzed  with  this  technique.  Section  III  introduces  the 
concept  of  well-posedness,  and  test  for  well  posedness  for  general  Yee  schemes  based  on 
conformal  and  nonorthogonal  grids.  It  is  shown  that  originally  posed  algorithms  such  as  the 
contour  patch  FDTD  method,  the  nonorthogonal  FDTD  method,  and  the  PGY  method  are  actually 
ill-posed.  Well-po’sed  methods  are  proposed  and  demonstrated.  Section  IV  summarizes  the 
uniaxial  PML  method  and  its  implementation  within  the  FDTD  method.  It  also  demonstrates  that 
the  uniaxial  PML  can  be  Used  to  match  generalized  inhomogeneous  lossy  media.  Through 
validation  its  effectiveness  and  application  within  the  FDTD  method  is  presented. 

Sections  V  and  VI  deal  with  implicit  schemes  based  on  the  finite-element  method  (FEM). 
Section  V  presents  a  parallel  frequency  dependent  FEM  with  a  uniaxial  PML  mesh  termination. 
The  parallel  scheme  is  based  on  the  method  of  Lagrange  multipliers  and  excellent  speedups  are 
recorded.  This  method  is  extended  to  an  unconditionally  stable  implicit  finite  element  time-domain 
(FETD)  method  in  Section  VI. 

Section  VII  extends  the  nonorthogonal  FDTD  method  to  periodic  structures.  To  this  end,  a 
stable  FDTD  analysis  based  on  orthogonal  grids  is  first  introduced.  The  advantage  of  the  method 
introduced  is  that  it  is  inherently  stable  for  all  angles  of  incidence,  unlike  previous  methods.  This 
method  is  then  extended  further  to  a  nonorthogonal  grid  FDTD  method. 

Finally,  section  VIII  discusses  the  incorporation  of  linear  and  nonlinear  lumped  devices  into 
FDTD  and  PGY  simulations  using  the  state  variable  methods.  The  application  of  a  linear  amplifier, 
a  nonlinear  diode,  and  a  nonlinear  amplifier  are  presented. 


4 


S.  Gedney,  Rigorous  Analysis  of  Large  Scale  MMIC  Circuit  Devices 


II.  Explicit  Planar  Generalized  Yee  Algorithm  ^ 

A  number  of  explicit  methods  for  the  solutions  of  Maxwell’s  equations  based  on  non-orthogonal 
structured  [17,  18]  and  unstructured  methods  [1-5,  19]  have  been  proposed.  The  planar 
generalized  Yee-algorithm  (PGY)  algorithm  [3-5]  has  the  advantage  that  it  models  three- 
dimensional  geometries  with  planar  symmetry  via  a  mesh  that  is  fully  unstructured  along  two- 
dimensional  cross-sections  and  structured  along  the  third  dimension.  The  PGY  algorithm  is  based 
on  the  discretization  of  Ampere's  and  Faraday's  laws  in  their  integral  form  by  projecting  the  vector 
fields  onto  the  edges  of  a  dual,  staggered  unstructured  grid.  By  exploiting  planar  symmetry, 
significant  memory  savings  can  be  realized.  Based  on  such  a  discretization  of  the  fields,  and 
employing  a  central  difference  approximation  for  the  time-derivatives,  an  explicit  time-marching 
solution  for  vector  field  updates  can  be  derived.  The  matrices  are  highly  sparse,  although  due  to 
the  unstructuredness  of  the  grid  they  must  be  stored.  Fortunately,  due  to  the  regularity  of  the  grid 
along  one  dimension,  only  the  matrices  due  to  one  layer  of  cells  are  actually  stored.  This  greatly 
reduces  the  overall  memory  requirement.  The  explicit  update  scheme  is  then  expressed  as  a  linear 
operator  as 

(2) 

where  b  and  d  are  the  vector  of  discrete  vector  flux  densities  associated  with  the  primary  and 
secondary  cell  faces,  respectively,  the  superscripts  refer  to  discrete  time,  Q  and  C/,  represent  the 
discrete  contour  integrals  of  the  electric  and  magnetic  fields  about  primary  and  secondary  cell  faces, 
respectfully,  De  is  a  diagonal  matrix  with  entries  representing  the  inverse  of  the  relative 
permittivity,  and  and  Aj  are  the  projection  matrices.  The  role  of  the  projection  matrices  are  to 
project  the  normal  face  fields  onto  the  dual  edges  passing  through  the  faces.  Note  that  for 
simplicity  the  domain  is  assumed  to  be  lossless  and  non  magnetic.  The  explicit  field  updates 
provide  an  extremely  efficient  computational  technique  that  is  second-order  accurate  for  the 
simulation  of  the  time- varying  fields,  and  is  stable,  providing  the  time  step  satisfies  the  stability 
criterion  [3-5]. 

An  example  of  the  simulation  of  a  fields  within  a  Wilkinson  power  divider  is  first  presented. 
Figure  1  illustrates  the  geometry  of  a  Wilkinson  power  divider  designed  for  3  dB  power  division 
(equal  phase)  at  32  GHz  that  is  matched  to  50  fi  microstrip  lines.  A  100  chip  resistor  is  placed 
between  ports  2  and  3  for  isolation.  The  microstrip  device  is  printed  on  a  15  mil  TMM  substrate 
(£,-  =  3.25)  that  is  backed  by  a  ground  plane.  A  two-dimensional  unstructured  mesh  used  to 
analyze  this  circuit  consisted  of  roughly  3200  arbitrarily  shaped  quadrilaterals.  Along  the  vertical 
direction,  the  resultant  3-dimensional  mesh  was  25  cells  high  (uniformly  spaced).  The  fields  on 
the  exterior  boundary  are  updated  using  the  second-order  Higdon  boundary  condition.  Figure  2 
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illustrates  the  S-parameters  S//  and  52/  computed  for  this  device  using  the  PGY-algorithm.  It  is 
noted  that  331  =  821  due  to  geometric  symmetry.  These  results  are  also  compared  with  simulated 
results  using  the  FD-TD  algorithm.  The  FD-TD  model  of  the  Wilkinson  power  divider  was  based 
on  a  uniform  orthogonal  grid  (Ax  =  0.0573  mm,  Ay=  0.0573  mm,  and  Az=  0.0635  mm).  The 
curved  boundaries  were  approximated  by  a  staircase  approximation,  resulting  in  discretization 
error.  Furthermore,  due  to  the  uniformity  of  the  grid,  the  length  of  the  quarter-wave  transformers 
(the  curved  sections)  were  slightly  lengthened.  The  combination  of  these  lead  to  an  upward  shift  in 
the  resonant  frequency  and  the  other  deviations  from  the  PGY  simulated  results.  r 

Asa  second  example,  consider  the  simulation  of  a  transition  from  a  50  n  grounded  coplanar 
waveguide  (GCPWG)  to  a  50  Q  microstrip  line.  The  structured  was  printed  on  a  25  mil  Alumina 
substrate,  and  the  GCPWG  consisted  of  a  center  conductor  that  was  10  mils  wide,  two  symmetric 
outer  fins  40  mils  wide,  with  a  gap  of  5  mils  between  the  center  strip  and  the  outer  fins  (see  Fig. 
3).  Circular  vias  spaced  50  mils  on  center  interconnected  the  fins  to  the  ground  plane.  Figure  4 
illustrates  a  comparison  of  results  simulated  using  the  PGY  algorithm  with  measured  results 
performed  by  Hughes  Aircraft  Company.  Overall,  good  agreement  is  observed.  It  is  noted  that 
the  PGY  simulation  assumed  a  lossless  structure,  where  ~0.6  dB  of  conduction  loss  at  18  GHz  is 
expected  in  S21. 


Fig.  1  Ka-band  Wilkinson  power  divider  (32  GHz).  Source:  S.  Gedney  and  F.  Lansing,  IEEE 
Transactions  on  Microwave  Theory  and  Techniques. 
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f  (GHz) 


f  (GHz) 


(a)  .  (b) 

Fig.  2  Magnitude  of  the  S-parameters  of  a  Wilkinson  power  divider  computed  using  the  PGY 
and  the  FD-TD  methods,  (a)  |S2l|.  (b)|S2l|.  Source:  S.  Gedney  and  F.  Lansing, 


Fig.  3  Abrupt  Transition  from  50  Q  GCPWG  to  50  Q  microstrip  to  50  Q  GCPWG  printed  on,a  25 
mil  Alumina  Substrate,  (a)  Cross  section  of  the  GCPWG  circuit  with  50  mil  spaced  vias.  (b) 
Top  view  of  the  abrupt  transition  (fabricated  by  Hughes  Aircraft). 


f  (GHz) 

Fig.  4  Computed  and  Measured  S-parameters  for  a  GCPWG  -  microstrip  transition  (Measurements  by 
Hughes  Aircraft) 
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III  Well  Posed  Non-Orthogonal  FDTD  Methods 

1.  Well  posedness 

The  coupled  difference  equations  in  (1)  and  (2)  are  explicit  in  nature  and  are  conditionally  stable. 
These  equations  are  said  to  be  stable,  providing  the  time  step  satisfies  the  criterion  [20]: 

2 


Ar< 


(3) 


A^maxW) 

where 

(4) 

This  stability  condition  is  a  necessary  but  not  a  sufficient  condition  to  ensure  the  strict  stability 
of  the  coupled  difference  equations  in  (1)  and  (2).  The  formulation  must  also  be  well  posed  [21]. 
If  the  formulation  is  not  well  posed,  then  it  can  be  unconditionally  unstable.  The  definition  of  well 

posedness  will  be  based  on  the  matrix  M.  ^ 

DeHnition.  The  coupled  system  in  (1)  and  (2)  is  symmetric  hyperbolic  if  M  is  a  Hermitian 
matrix.  It  is  called  strictly  hyperbolic  if  the  eigenvalues  are  real  and  distinct,  strongly  hyperbolic  if 
the  eigenvalues  are  real  and  thei^e  exists  a  complete  system  of  eigenvectors,  and  weakly  hyperbolic 
if  the  eigenvalues  are  real  [2 1 ,  Sect.  6.3].  The  system  will  be  ill  posed  if  it  is  not  at  least  strongly 
hyperbolic. 

To  aide  in  the  understanding  of  the  above  definition,  it  is  instructive  to  perform  an  eigenvalue 
analysis  of  the  coupled  explicit  equations.  To  this  end,  the  vector 


w”  = 


b” 


is  introduced  such  that  (1)  and  (2)  are  reposed  as  a  first-order  difference  equation: 
w"  =  Gw”-\ 

where  G  is  referred  to  as  the  growth  matrix,  and  is  more  explicitly  written  as: 

I  I 

AtCf^Ai,  I  -  /sP'Ci^Ai^CgD^Aj  _ 

It  can  be  shown  that  the  eigenvalues  of  G  are  expressed  as  [20]: 


A^  =  l- 


At^A, 


'1 


-2 


Ar^A„ 


(5) 

(6) 

(7) 

(8) 


where  A„,  are  the  eigenvalues  of  M.  If  the  time  step  satisfies  (3)  and  A„,  is  positive  real,  then  the 
term  within  the  radical  will  be  negative.  Thus,  (8)  can  be  rewritten  as 


A/7  — 
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It  is  seen  inunediately  that  =  1.  This  is  true  for  all  that  are  real  and  >  0  providing  (3)  is 

satisfied.  Interestingly,  this  is  expected  since  without  dissipation  the  total  energy  in  the  system  is 
unchanged  with  time.  If  (3)  is  not  satisfied,  the  term  under  the  radical  sign  is  positive,  and  it  is 
immediately  seen  that  of  the  two  eigenvalues  of  G  derived  from  (8)  will  be  purely  real  and  at  least 

one  of  them  will  lie  outside  of  the  unit  circle. 

If  the  system  is  weakly  hyperbolic,  then  a  complete  set  of  eigenvectors  does  not  exist  and  G  will 
be  ill-conditioned  which  can  lead  to  instabilities.  In  such  a  case,  an  eigenvalue  analysis  is  not 
sufficient  to  determine  stability.  Following  a  pseudo  spectral  analysis  it  can  be  seen  that  an  ill- 
conditioned  system  can  go  unstable  despite  its  eigenvalues  being  on  the  unit  circle  due  to  the 
sensitivity  of  the  system  to  small  perturbations  [22, 23]. 

Finally,  if  M  has  any  complex  eigenvalues  it  can  easily  be  shown  that  the  system  will  be 
unconditionally  unstable.  Specifically,  assume  that  there  exists  a  that  is  complex.  Assuming 
that  Ar^jA^I  <  4,  it  can  be  shown  that  the  eigenpair  for  Xq  derived  from  (8)  has  one  solution  inside 
the  unit  circle  and  the  other  is  outside  the  unit  circle.  As  an  example,  let  At^|A,„|  =  \±j\.e-2>, 
then  Xq  =  1.000577Z  ±  60®,  0.999423Z  ±  60®.  Having  a  Xg  occur  outside  of  the  unit  circle  will 
lead  to  instability  since  stability  requires  that  |G|  <  1,  which  obviously  is  no  longer  satisfied. 

The  question  that  remains  is  under  what  practical  conditions  will  (1)  and  (2)  be  well  posed. 
Initially,  consider  the  simplest  case  of  a  dual  orthogonal  grid  that  is  cubic  (i.e.,  Ax  =  Ay  =  Az). 
Under  such  conditions.  Mis  a  real  symmetric  matrix.  Subsequently,  the  system  is  symmetric 
hyperbolic,  and  is  well  posed.  For  a  non  cubic  grid,  M  is  not  symmetric.  Nevertheless,  it  is  easy 
to  show  that  M=  C/,Ce  can  be  symmetrized  through  a  diagonal  transformation  and  the  FDTD 
method  is  strongly  hyperbolic. 

Next,  assume  that  the  mesh  is  irregular  and  non-orthogonal.  For  sake  of  illustration  neglect  the 
projections  Ab  and  Ad  such  that  M  =  Interestingly,  due  to  the  complementary  nature  of 

the  line  integrals  over  the  dual  grid  the  system  is  strongly  hyperbolic.  This  is  true  even  for 
inhomogeneous  media.  Unfortunately,  this  formulation  is  only  first-order  accurate  for  general 
grids.  Nevertheless  it  provides  useful  insight  to  the  more  general  formulation. 

Finally,  for  the  most  general  case  when  the  grid  is  irregular  and  non  orthogonal  and  the 
projection  operators  are  included,  M  is  non  symmetric.  Because  of  the  product  of  matrices,  even  if 
is  symmetric,  or  is  symmetrizable,  M  may  not  be.  If  the  matrices  At  and  Ad  are  non- 

symmetric,  M  will  likely  have  complex  eigenvalues.  As  originally  proposed,  the  projection 
operators  Ad  and  Ab  derived  from  the  NFDTD  or  DSI  methods  are  non-symmetric  if  the  grid  is 
irregular  and  non  orthogonal.  Subsequently,  both  formulations  can  potentially  suffer  from  late 
time  instabilities.  This  is  demonstrated  by  Roden  [24]. 
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If  one  can  pose  a  technique  based  on  the  DSI/GY  algorithms  or  NFDTD  algorithms  that  leads  to 
M  having  real  and  distinct  eigenvalues,  then  the  problem  can  be  recast  in  a  well  posed  manner.  It 
is  observed  that  if  Ad  and  Ab  are  symmetric  and  well  conditioned  and  £/•  is  constant  such  that 
Ds  =  Sr  I,  then  M  will  have  real  and  distinct  eigenvalues  (what  defines  well  conditionedness  of 
and  Ad  will  be  described  in  the  next  section).  Interestingly,  if  Ad  and  Ab  are  symmetric  the  reaction 
between  neighboring  fields  due  to  the  projection  operators  are  equivalent.  Methods  by  which  to 
derive  symmetric  projection  matrices  without  sacrificing  accuracy  are  presented  in  [20]. 

Unfortunately,  if  £,-  is  inhomogeneous  this  may  introduce  complex  eigenvalues  in  M  even  liAb 
and  Ad  are  symmetric.  This  is  principally  due  to  the  non-reciprocal  nature  that  results  from  the 
projections  (e.g.,  DsAd  ^  AdDe).  An  effective  way  to  force  symmetry  in  the  projection  operator  is 
to  approximate  (6.49b)  as: 


.4 


b"  =  -  AtC^DI  AjDl  d  ^ 


Then,  G  is  rewritten  as: 


-htc,iyiA,iyl 


MC,,Af,  I-At^ChAt,C,DlAjDj 


M»M,  =  ChAi,CgDlAjDl.  (12) 

It  can  be  shown  that  if  and  4  i  are  symmetric  and  well  conditioned,  then  has  real  and 

distinct  eigenvalues  even  for  inhomogeneous  medium.  ^  ^ 

It  must  be  realized  that  is  an  approximation.  The  projection  operation  e  =  D^A^D]  d  maps 
the  normal  flux  density  to  the  field  intensity  projected  on  the  edge  passing  through  the  face. 
Examining  the  i-th  entry  of  e: 


, 


where,  a^.  is  the  entry  of  matrix  Ad  in  the  i-th  row  andy-th  column.  The  diagonal  term  of  the 
projection  operation  is  unchanged  by  the  use  of  M^.  Whereas,  the  off  diagonal  terms  are 
normalized  by  an  effective  permittivity  expressed  as  the  geometric  mean  of  the  permittivities  of  the 
two  adjacent  edges.  While  this  is  an  approximation,  it  is  demonstrated  in  Section  6.5.4  that  this 
approximation  does  not  degrade  the  accuracy  of  the  NFDTD  or  DSI/GY  algorithms.] 
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2.  Validation 


A  simple  example  is  now  presented  to  study  the  accuracy  of  the  symmetric  projection  operators. 
To  this  end,  a  benchmark  test  case  of  a  circular  dielectric  ring  in  a  rectangular  PEC  cavity  is  studied 
[25].  The  geometry  is  provided  in  Fig.  5.  Both  unstructured  and  structured  grids  were  generated 
to  analyze  this  problem  using  the  DSI  and  NFDTD  algorithms,  respectively.  A  cross-section  of  the 
structured  and  unstructured  grids  are  illustrated  in  Fig.  4. 

'  The  fields  in  the  resonant  cavity  were  simulated  using  the  original  NFDTD  algorithm  and  the 
well  posed  NFDTD  algorithm  as  well  as  the  original  PGY  algorithm  and  the  well-posed 

algorithm  PGY.  The  fields  in  the  cavity  were  driven  by  injecting  a  vertically  oriented  current 
density  placed  at  a  non  symmetric  point  described  by: 

{t-t,,)- 


J(t)  =  -z 


2{t-0 

K- 


e 


where  =  0.2122  ns,  and  to  =  Stn-  The  vertical  field  was  probed  in  the  cavity  and  the- time 
simulation  was  performed  for  25,000  time  steps  with  ^t  =  4.5  ps.  The  vertical  field  was  Fourier 
transformed  using  an  FFT,  and  the  resonant  frequencies  were  extracted.  Table  1  presents  the 
resonant  frequencies  for  the  first  4  modes  as  calculated  using  the  symmetric  PGY  and  NFDTD 
methods  and  the  measured  dominant  mode  [25].  These  results  are  also  compared  to  those  obtained 
using  an  implicit  FETD  method  [6].  The  results  were  also  obtained  using  the  non-symmetric  PGY 
and  NFDTD  algorithm  The  resonant  frequencies  compared  to  within  0.1  %  as  compared 

to  the  well  posed  methods.  It  is  further  noted  that  for  this  case,  the  non-symmetric  PGY  method 
ran  for  30,000  time  steps  before  going  unstable.  The  symmetric  PGY  and  NFDTD  methods  ran 
for  250,000  time  steps  and  still  showed  no  signs  of  instability. 

The  dielectric  ring  in  cavity  problem  was  repeated  when  the  dielectric  relative  permittivity  was 
increased  to  9.8.  For  this  case,  the  grid  density  in  and  near  the  ring  was  roughly  doubled  to 


properly  resolve  the  fields.  The  calculated  resonant  frequencies  are  presented  in  Table  2.  Again, 
the  symmetric  DSI/GY  simulation  was  stable  for  over  250,000  time  steps.  Interestingly,  the 
symmetric  NFDTD  simulation  did  eventually  go  unstable  in  the  very  late  time  for  this  geometry. 
Observing  the  refined  NFDTD  lattice,  it  was  observed  that  there  were  four  cells  with  highly  obUise 
interior  angles.  This  lead  to  an  ill-conditioned  matrix  as  discussed  in  the  previous  section.  This  is 
a  penalty  of  structured  gridding. 
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5  Dielectric  ring  in  a  rectangular  PEC  cavity  (a  -  324  mm,  b  -  12\  mm,  c  -  43  mm. 
The  ring  is  centered  at  \vl  =  207.25  mm,  w2  ~  116.75  mm,  b/2  along  the  y-direction 
and  rests  on  the  ground  plane.  The  ring  has  a  height  of  h  =  39  mm,  inner  radius  /■/  — 
16.65  mm,  and  outer  radius  =  26.75  mm.) 


Table  1 


Resonant  Frequencies  of  the  Dielectric  Ring  Loaded  Cavity,  Sy  -  2.06 


Mode 


kOl 


k02 


k03 


k 


Meas. 


WP-DSI 


1.258  GHz 


1.509  GHz 


1.836  GHz 


2.158  GHz 


1.509  GHz 


FETD 


1.259  GHz 


1.512  GHz 


2.161  GHz  I  2.175  GHz 


Table  2 

Resonant  Frequencies  of  the  Dielectric  Ring  Loaded  Cavity,  Sr  —  9.8 


Mode 

kOl 

k02 

k03 

k04 


WP-DSI 

0.9520  GHz  0.9520  GHz 
1.415  GHz"  1.415  GHz 


1.608  GHz 
2.024  GHz 


1.612  GHz 
2.026  GHz 


FETD 

0.9518  GHz 
1.420  GHz 
1.615  GHz 
2.034  GHz 
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IV  Uniaxial  Perfectly  Matched  Layer  Truncation  of  Computational 
Space  for  Simulation  of  Unbounded  Domains. 

1.  Introduction 


Explicit  time-domain  methods  such  as  the  finite-difference  time-domain  (FDTD)  method  and  the 
^  planar  generalized  Yee  (PGY)  algorithm  have  been  highly  effective  for  the  analysis  of  practical 
microwave  circuit  devices  and  antennas.  One  of  the  most  challenging  aspects  of  these  methods  is 
implementing  absorbing  boundary  conditions  that  can  accurately  truncate  the  mesh  over  broad 
frequency  bands.  The  perfectly  matched  layer  (PML)  absorbing  media  introduced  by  J.-P. 
Berenger  [26]  has  been  demonstrated  to  be  a  highly  effective  method  for  the  termination  of  FDTD 
lattices  [27,28].  and  can  result  in  reflection  errors  as  minute  as -80  dB  to -100  dB.  Recently,  it 
has  been  shown  that  the  PML  method  can  be  reposed  in  a  Maxwellian  form  as  a  uniaxial 
anisotropic  medium  [14,  15,  29].  It  has  been  demonstrated  that  the  uniaxial  medium  can  be 
perfectly  matched  to  lossy,  inhomogeneous,  dispersive,  isotropic  and  anisotropic  medium.  Most 
significant  is  that  the  extension  to  such  complex  media  in  a  FDTD  implementation  is  quite  trivial. 
Furthermore,  since  the  uniaxial  PML  formulation  is  Maxwellian  and  not  restricted  to  an  orthogonal 
field  splitting  and  it  can  be  easily  extended  to  more  generalized  methods  such  as  the  non-orthogonal 
FDTD  method  [16].  The  focus  of  this  summary  will  be  on  the  efficient  implementation  of  the 
uniaxial  PML  method  and  the  expected  accuracy.  The  simple  extension  to  more  complex  media 
will  also  be  discussed  in  the  context  that  its  implementation  within  an  existing  FDTD  code  is  quite 
trivial. 

2.  The  Uniaxial  PML 


It  was  shown  in  [14,  15,  29]  that  an  arbitrary  polarized  wave  incident  on  a  planar  half  space 
(defined  by  the  z  =  0  plane)  will  be  perfectly  transmitted  provided  that  the  half  space  is  a  uniaxial 
medium  with  constitutive  parameters 


\ 

o 

o 

_ J 

O 

o 

_ 1 

S  — 

0  ^2  0 

0  0 

,  P  =  PoPr 

0  5.  0 

_o  0  s:\ 

where  Sr  and  are  defined  by  the  medium  of  the  upper  half  space.  It  can  be  demonstrated  that  this 
perfectly  transmitting  property  is  still  valid  if  the  upper  half  space  is  inhomogeneous,  lossy, 
dispersive,  and  even  anisotropic. 

The  intention  of  the  PML  medium  is  to  rapidly  attenuate  waves  entrant  into  the  medium.  Thus  a 
suitable  choice  for  i'- is: 


=  Kr  + 


CT- 

7^' 


(15) 
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Given  the  propagation  constant  of  the  incident  wave  to  be  yi  =  ai  +7^;,  then  it  can  be  shown  that 
the  propagation  constant  of  the  transmitted  wave  in  the  uniaxial  PML  region  is  [  1 5] 

The  real  part  of  yf  leads  to  the  attenuation  of  the  wave.  Thus,  the  loss  term  ov  will  lead  to  the 
attenuation  of  all  propagating  modes,  and  the  real  term  kv  will  lead  to  the  amplification  of  the 
attenuation  of  all  evanescent  modes  incident  on  the  planar  interface.  As  a  result,  this 
parameterization  is  analogous  to  the  generalized  PML  introduced  by  Fang  et  al.  [30],  which  is  an 
extension  of  the  original  Berenger  formulation. 

The  uniaxial  PML  can  be  used  to  terminate  a  three-dimensional  FDTD  space  accurately  and 
efficiently.  To  this  end,  the  FDTD  lattice  is  terminated  on  all  6  sides  via  planar  PML  media,  which 
are  backed  by  a  PEC  wall.  Under  such  circumstances,  it  is  recognized  that  the  planar  interfaces 
will  overlap  in  what  is  referred  to  as  the  comer  regions.  In  order  to  derive  the  constitutive  relations 
in  these  comer  regions,  it  is  reasonable  to  match  the  PML  to  the  adjacent  uniaxial  medium.  For 
example,  let  the  upper  half  space  have  permittivity  and  permeability  tensors: 


^1  “  ^o^r 


0  0^  5;'  0  0 

0  ,  ^,  =  0  fv  0 

b  0  0 


Then,  assuihing  an  arbitrary  polarized  plane  wave  will  be  perfectly  transmitted  into  the'  lower  half 
space  separated  by  a  z  =  constant  plane  if  the  permittivity  and  permeability  tensors  are  defined  as: 

’5;'  0  0  0  1  _  \s;'  0  Ol  s,  0  0 

£2=S^€,  0  0  0  5-0  ,^l2=^ot^r  0  5,^  0  0  5-  0  .  (18) 

[0  6  5,1 0  0  5:'J  ^  0 

This  can  be  further  generalized  to  ay  interface,  leading  to  the  general  anisotropic  tensors 


0  0 


s  =  jj,  =  and  s  =  0 


0  0 


where, 


5^  =  + 


a  1  f  cr  \  cr, 

jcoej  '  '  jcoej  '  y  jcos^ 


Within  this  uniaxial  region.  Maxwell's  curl  equations  are  expressed  as: 

V  X  H  =  j(OSg£r(G>)lE,  V  X  E  =  (21) 

It  is  from  these  equations  that  the  explicit  field  updates  will  be  derived.  As  an  example,  consider 
the  update  expression  for  £V.  Introducing  the  constitutive  relation 
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where  Sr  is  assumed  to  be  a  function  of  x  and;^,  it  follows  from  (6)  -  (8): 
Subsequently,  this  leads  to  the  discrete  field  update  for  D-: 


£)"+i  =^L_i£iL£)"4  + - ! -  //'!  -H"  ,  /Ax-  //;'  ,  ,-//?  ,  , 

■/.y.A+i  f:L  +  ^U  -.vi/A+i  IJ- 2-^*2 


Given  D;,  an  auxiliary  relationship  can  be  derived  for  £■;  as 
s,D,  =  e,s,s,E,,or 

jcoK.D^  +  ^D.  =  SqS,.  ( jcoK^^E.  +  ^  E.). 

Then,  transforming  (25)  into  the  time-domain,  approximating  the  time  derivatives  using  a  central 
difference  approximation,  and  averaging  ^D.  and  ^E.  in  time,  this  results  in  a  second-order 

accurate  explicit  update  equation: 


£"4  ^£”4  ^ + - K— 

'  •^■■*4  "'•^■•*4  (K,  y  (K,  +  ^)eoer 


/.y.A+4  28q 


Similar  update  equations  can  be  derived  for  the  remaining  field  components. 

The  uniaxial  PML  method  is  easily  extended  to  more  general  media  such  as  lossy  media, 
dispersive  media,  or  non-linear  media.  This  can  easily  be  done  through  the  addition  of  an 
additional  auxiliary  equation(s)  [15],  which  is  a  simple  extension  of  the  above  algorithm. 


S.  Matching  PML  to  Generalized  Media 

In  the  previous  section,  the  PML  was  matched  to  a  homogeneous  medium  supporting  plane 
wave  type  solutions  impinging  on  the  PML  boundary.  Because  the  PML  is  matched  for  all  angles 
of  incidence  and  polarizations,  one  could  generalize  this  to  arbitrary  finite  sources,  or  scattered 
fields  emanating  from  complex  geometries  by  using  plane  wave  expansions  of  the  fields. 
However,  a  more  general  proof  is  needed  to  demonstrate  its  effectiveness  for  matching  to  waves 
propagating  in  inhomogeneous  media.  This  is  the  focus  of  this  section. 

Consider  a  wave  propagating  in  an  inhomogeneous  media  along  an  axial  direction,  chosen 
arbitrarily  to  be  the  z-axis.  It  is  assumed  that  the  material  medium  is  inhomogeneous  along  the 
transverse  axes  (x,y),  and  invariant  along  the  axial  direction.  The  electromagnetic  waves 
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supported  by  such  a  medium  can  be  decomposed  into  TEz  and  TMz  polarized  waves.  In  this 
medium,  we  pose  time-harmonic  solutions  to  the  wave  equation  of  the  form: 

TE^:  H,=h,{x,y)e~^^--  (27) 

TM^;  E,=e,(x,y)e~^^\  (28) 

where,  =  0  for  the  TEz  polarized  waves,  and  /f-  =  0  for  the  TMz  polarized  waves.  From 
Maxwell's  equations,  the  remaining  fields  can  be  derived  from  the  axial  fields  as: 


^  +  yl  dy 

dH. 


H  -y-- 

^  +  y^  dx  ' 


(29) 


k^  ^yi  dx 

£■-  =  0, 

for  the  TEz  polarization.  The  TMz  fields  are  derived  from  (28)  leading  to  a  dual  expression  of 
(29). 

A  PML  half  space  is  interfaced  with  the  inhomogeneous  medium  in  the  z  =  0  plane.  It  is 
assumed  that  the  transversely  inhomogeneous  material  profile  extends  through  the  PML.  From 
( 1 9),  the  PML  has  the  constitutive  relations  related  to  the  z-normal  interface: 


r 

O 

O 

O 

1 _ 

0  ■ 

e  =  s(x,y) 

0  0 

,  n  =  n(x,y) 

0  5. 

0 

0  0 

o 

o 
_ 1 

where  s  and  in  can  be  complex.  The  generalized  wave  equation  is  then  derived,  and  the  axial  field 
solutions  are  then  derived  for  the  dual  polarizations: 

TE^:  H,=s.h.{x,y)e'^-‘^'-\  (20) 

TM^:  £.  =5.e.(x,y)e"^-’’'-”.  (20 

where,  hz{x,y^  and  S2(x,y)  are  identical  to  those  in  (27)  and  (28),  respectively.  Note  that  the 
weighting  the  normal  fields  is  consistent  with  Gauss'  law^f.  For  the  TEz  polarized  case,  the 
remaining  fields  are  again  derived  from  Maxwell's  equations  in  the  uniaxial  medium,  leading  to. 

1  dH. 


_  -join 

-  ,2  .  ..2 


k^  +  yj  s.  dy 
jeon  ) 


ffx=  — 


1  dH. 


1^2  .  ..2 


~  k2  .  ..2 


k^  +  y2  Sr  dx 

£.=0, 


77v  = 


+  yt  s.  dx 
-y,  \  dH, 

+  Yz  ^ 


(32) 


The  TMz  fields  are  simply  the  dual  of  (32). 

Finally,  it  is  seen  that  inserting  the  axial  fields  in  (30)  in  (32),  and  (27)  in  (29),  the  transverse 
fields  are  continuous  across  the  z  =  0  plane.  Furthermore,  it  is  seen  that  the  wave  transmitted  into 
the  PML  medium  has  the  same  characteristic  wave  impedance  relative  to  the  axial  direction  as  that 


16 


S.  Gedney,  Rigorous  Analysis  of  Large  Scale  MMIC  Circuit  Devices 


of  the  impinging  waves  for  both  polarizations.  Subsequently,  the  proposed  uniaxial  medium  is 
perfectly  matched  to  the  inhomogeneous  space. 


4.  Reflection  Error 

The  preceding  theoretical  analyses  have  assumed  the  PML  to  be  a  half  space.  Employing  the 
PML  to  truncate  the  FDTD  lattice  boundaries,  the  PML  have  a  finite  thickness  and  terminated  by  a 
boundary.  If  the  exterior  boundary  is  assumed  to  be  a  perfectly  electrical  conducting  (PEC)  wall, 
there  will  be  reflected  power  back  into  the  interior  FDTD  region.  The  reflection  can  be  calculated 
using  a  simple  transmission  line  analysis  and  will  have  an  amplitude  of: 

(33)'^ 

where  0  is  the  angle  of  incidence,  d  is  the  thickness  of  the  PML  slab,  r\  is  the  characteristic 
impedance  of  the  reference  material,  and  cris  the  conductivity  of  the  PML.  Within  the  context  of 
an  FDTD  simulation  R{6)  is  referred  to  as  the  "reflection  error"  since  it  is  a  non-physical  reflection 
due  to  the  PEC  backed  PML  slab.  This  reflection  error  decays  exponentially  with  the  thickness  of 
the  PML  as  well  as  with  the  conductivity  of  the  PML.  However,  the  cosine  dependence  of  9 
dictates  that  at  higher  angles  of  incidence,  the  reflection  error  will  increase  for  a  fixed  cr  and  d.  To 
be  effective  within  an  FDTD  simulation,  it  is  desirable  for  the  PML  medium  to  be  as  thin  as 
possible.  Thus,  for  a  small  d,  one  must  have  a  large  conductivity  to  reduce  R{9)  to  an  acceptably 


small  level. 

The  PML  interface  presents  a  discontinuity  in  both  the  electrical  and  magnetic  conductivity.  In 
the  discrete  space  representation  of  Maxwell's  curl  equations  via  the  FDTD  method,  it  is  realized 
that  these  discontinuities  are  modeled  using  a  linear  approximation  [31].  Furthermore,  cr  and  o 
are  staggered  by  one  half  of  a  lattice  cell  due  to  a  staggering  of  the  electric  and  magnetic  fields. 
Subsequently,  large  discontinuities  in  a  and  o*  will  lead  to  significant  discretization  error  which 
will  manifest  as  a  spurious  reflection.  To  reduce  this  reflection  error,  it  was  proposed  by  Berenger 
to  spatially  scale  the  conductivity  profile  along  the  normal  axis  [26].  It  can  be  shown  that  as  long 
as  the  conductivity  is  scaled  along  the  normal  axes,  then  the  material  is  still  perfectly  matched. 

In  the  continuous  space,  assume  that  cr  is  a  function  of  x.  Assuming  a  PEC  backed  PML  slab  of 
thickness  d ,  the  reflection  due  to  a  purely  propagating  wave  impinging  at  angle  9  is: 


(34) 

A  few  profiles  have  been  suggested  for  scaling  cr.  The  most  successful  have  been  a  polynomial 


scaling  and  a  geometric  scaling  [32].  The  polynomial  scaling  is  simply: 


This  scales  the  conductivity  from  0  at  the  PML  —  working  volume  (WV)  interface  to  Cmax  the 
PEC  boundary.  The  resultant  reflection  coefficient  is: 
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^^^^_g-2>7e,<^max</cos0/(/«+l)^  (36) 

Polynomial  scaling  provides  two  parametric  coefficients  for  a  fixed  PML  thickness:  Omax  and  m. 
The  larger  m,  the  flatter  the  conductivity  will  be  near  the  PML  —  WV  interface.  Deeper  into  the 
PML,  the  conductivity  increases  more  rapidly.  In  this  region,  the  field  amplitudes  have  decayed 
substantially  and  the  reflections  due  to  discretization  error  contribute  less.  Typically  ni  in  the  range 

between  3  and  4  has  been  found  to  be  suitable  [14, 32-34]. 

If  the  real  part  of  s  is  >  1,  then  it  can  also  be  spatially  scaled  using  the  polynomial  scaling. 
Thus,  k:  will  scale  from  1  to  some  maximum  value  k^ox  at  the  PEC  boundary.  To  this  end: 

where  Kmax  ^  1 . 

For  polynomial  scaling,  the  parameters  can  be  determined  for  a  given  error  estimate.  For 
example,  given  m  and  d,  it  is  desired  to  achieve  a  reflection  error  of  i?(0),  then  from  (7.60)  cr^ax 

is  computed  as: 

ln(/?(0))(ffl  +  l)  (38) 

2r)e,d  ' 

Geometric  scaling  scales  the  conductivity  profile  geometrically.  Let  Ax  be  the  spatial  increment 
of  the  FDTD  lattice.  Then  [33], 


(T(x)  =  |^g-j  a,. 

Specifically,  the  conductivity  scales  from  (Tq  at  the  PML  —  WV  interface  to  g^Oo  at  the  PEC 
boundary,  where  d  =  VAx  is  the  thickness  of  the  PML  slab.  From  (7.58),  this  results  in  a 

reflection  coefficient  of: 

|j(0)_g-2'7'To^^(«^-Ocos0/lng  (40) 

For  geometric  scaling,  there  are  two  parameters  to  choose  for  a  fixed  d:  g  and  Gq.  Gq  is  the 
conductivity  in  the  jc  =  0  plane,  or  the  PMLAW  interface.  Thus,  it  must  be  small  to  minimize  the 
initial  discretization  error.  The  metric  g  governs  the  rate  of  increase  of  the  conductivity  deeper  into 
the  medium.  Of  course,  g  is  always  greater  than  1.  The  larger  g,  the  flatter  the  conductivity  profile 
near  the  interface,  and  the  steeper  it  increases  deeper  into  the  PML  slab. 

Again,  if  k:  is  >  1 ,  then  it  can  also  be  geometrically  scaled  as: 


Thus,  K  will  scale  from  1  to  some  maximum  value  at  the  PEC  boundary. 
For  geometric  scaling,  one  would  predetermine  g,  d,  and  R(0).  Then, 
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ln(i?(0))ln(g)  (42) 

2riSr^x{g^  -\) 

where  d  =  iVAx  is  the  thickness  of  the  PEC  backed  PML  layer. 

The  design  of  the  optimal  PML  results  in  a  delicate  balance  of  the  theoretical  reflection  error 
R{e)  and  discretization  error.  For  example,  (38)  provides  (Tmax  for  ^  spatially  scaled  conductivity 
given  a  predetermined  R{e)  and  m.  In  practice,  if  a^ax  is  small,  the  predominant  reflection  of  the 

finite  PML  will  be  due  to  PEC  backed  wall.  In  fact,  (36)  provides  a  fairly  accurate  approximation 
of  the  reflection  error  if  cT^jax  is  sufficiently  small.  However,  in  practice  one  would  rather  choose 
o’max  ro  be  as  large  as  possible  to  minimize  /?(0).  Unfortunately,  if  <T^ax  i^  ^oo  large,  then 
discretization  error  due  to  the  FDTD  approximation  will  dominate  and  the  actual  reflection  error 
will  be  orders  of  magnitude  higher  than  what  (36)  predicts.  Subsequently,  there  is  an  optimal 
choice  for  i?(0)  which  balances  reflection  from  the  PEC  wall  and  discretization  error. 

It  was  postulated  by  Berenger  in  [32,  33]  that  the  largest  discretization  error  manifesting  as 
reflection  error  occurs  at  the  PML  interface.  Energy  that  penetrates  into  the  PML  that  is 
subsequently  reflected  will  be  attenuated  before  exiting  the  PML  and  typically  is  not  as  large  a 
contribution.  Thus,  it  is  desirable  to  minimize  the  step  discontinuity  at  the  PML  interface.  One 
way  to  achieve  this  is  by  spatially  scaling  the  conductivity  in  the  PML  as  presented  earlier  in  this 
section.  In  fact,  for  larger  m  or  larger  g,  the  flatter  the  conductivity  profile  near  x  =  0,  thus 
reducing  o:v(0).  However,  if  m  or  g  become  too  large,  then  reflections  from  deeper  within  the 

PML  will  begin  to  dominate. 

For  geometric  scaling,  the  optimal  g  is  typically  between  2  and  3.  For  polynomial  scaling,  the 
optimal  m  is  typically  between  3  and  4.  Through  extensive  experimental  study,  it  was 
demonstrated  in  [14, 35]  that  for  a  broad  range  of  applications  an  optimal  choice  for  a  10  cell  thick 
PML  is  safely  assumed  to  be  R(0) «  For  a  5  cell  thick  PML,  /?(0) «  is  optimal.  From 
(38),  this  leads  to  an  optimal  choice  for  cTmax- 

„  -  (43) 

150;rT/£7Ax 

where  Ax  is  the  grid  spacing  along  the  axial  direction.  For  many  cases,  will  be  close  to 
minimizing  the  reflection  error,  and  this  value  has  proven  to  be  quite  robust  and  applicable  for 
many  applications,  as  will  be  further  illustrated  below. 

5.  Validation 

To  demonstrate  the  effectiveness  of  the  PML,  consider  a  current  source  radiating  in  an 
unbounded  medium,  as  illustrated  in  Fig.  6.  The  source  is  a  vertically  directed  current  source  that 
is  invariant  along  the  axial  z-direction.  Hence,  it  will  radiate  two-dimensional  TEz  waves. 
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r  ^  _ 

A  two-dimensional  adaptation  of  the  FDTD  algorithm  truncated  by  UPML  media  was 

implemented.  The  FDTD  lattice  used  to  simulate  this  problem  consisted  of  a  40  mm  X  40mm 
region.  The  spatial  discretization  used  assumed  Ax  =  Ay  =  1  mm,  and  the  time  step  was  0.98  of 
the  Courant  limit  (At  =  0.92457  ps).  Hence  a  40  x  40  cell  lattice  was  employed  in  the  working 
volume  region.  The  vertical  current  source  was  placed  in  the  center  of  the  computational  domain 
and  had  a  differentiated  Gaussian  pulse  time  signature: 


where  the  pulse  half  width  tw  =  2.65258  X  lO'l  >  s,  and  the  pulse  delay  to  =  4  /w 


(44) 

Note  that  the 


amplitude  is  scaled  by  r„,.  The  electric  field  was  probed  at  two  points,  A  and  B,  as  illustrated  in 
Fig.  6.  A  is  in  the  same  plane  as  the  source  and  two  cells  from  the  PML  interface,  and  B  is  two 
cells  from  the  bottom  and  side  PML  walls.  The  electromagnetic  fields  were  simulated  over  1000 


time  steps,  which  is  well  past  the  steady  state  response. 

Both  a  5  and  10  cell  uniaxial  PML  absorbing  medium  was  used  to  terminate  each  exterior 
boundary.  The  PML  parameters  were  scaled  using  a  polynomial  scaling,  where  m  =  4,  cTmax  = 
obpt,  and  fCmax  “1-  The  relative  error  computed  at  points  A  and  B  due  to  5  and  10  cell  thick  PML 
layers  as  a  function  of  time  is  illustrated  in  Fig.  7.  To  compute  the  relative  error  at  points  A  and  5, 
a  reference  simulation  was  performed  a  priori  using  a  1240  X  1240  cell  lattice  with  the  current 
source  located  at  the  center  of  the  lattice  and  the  fields  at  points  A  and  B  at  the  same  relative 
positions.  The  grid  was  made  sufficiently  large  such  that  no  reflections  from  any  boundaries 
would  be  present  in  the  reference  solution.  It  is  noted  that  the  error  is  related  to  the  maximum  field 


value  at  that  point  during  the  time  simulation. 

Observing  Fig.  7,  it  is  seen  that  the  maximum  error  occurs  in  late  time.  This  error  is 
predominately  low  frequency  error  as  anticipated.  As  time  advances  further,  this  error  gradually 
decreases.  It  is  also  observed  that  the  error  at  point  A  is  always  less  than  that  at  point  B.  This  is 
mainly  because  the  wave  impinging  on  the  wall  near  point  A  is  predominately  normally  incident. 
Thus,  maximum  absorption  is  seen.  At  point  B,  the  amplitude  of  the  wave  is  smaller,  due  to  the 
radiation  pattern  of  the  source,  and  the  wave  impinges  on  the  boundaries  at  45  . 

The  relative  error  in  Fig.  7  was  computed  assuming  a  fixed  cTmax  (  ~  <^opt)*  Additional 


information  can  be  gained  by  observing  the  maximum  error  over  1,000  time  iterations  recorded  at 
points  A  and  B  as  a  function  of  <Tmax-  This  is  illustrated  in  Fig.  8,  where  ovnax  is  varied  from  0  to 
3  times  obpt  for  the  5  and  10  cell  thick  PML  layers.  It  is  noted  that  the  same  ovnax  was  used  for 
each  of  the  four  boundaries,  and  w  =  4  was  again  assumed.  Interestingly,  at  points  A  and  B,  the 
optimal  choice  for  cxmax  is  quite  close  to  Oopt-  Again,  though,  the  maximum  error  in  the  fields  at 
point  B  is  roughly  an  order  of  magnitude  larger  than  the  error  at  point  A  for  5  and  10  cells  thick 
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PMLs.  Fig.  9  illustrates  the  error  for  a  15  cell  PML,  and  it  is  shown  that  for  cTmax  =  cropt,  that  the 
errors  at  both  points  A  and  B  are  the  same,  and  more  than  5  digits  of  accuracy  are  realized. 

A  more  comprehensive  study  of  the  reflection  error  is  offered  by  Fig.  10.  Specifically, 
Figs.  lO.a  and  lO.b  are  contour  plots  of  the  maximum  relative  error  at  points  A  and  5, 
respectively,  recorded  over  1000  time  iterations  versus  the  conductivity  Gmax  normalized  by  Gopt 
as  well  as  the  polynomial  order  m.  It  is  noted  that  m  is  not  necessarily  an  integer,  and  was  scaled 
from  1  to  5.  It  is  observed  that  the  minimum  error  is  found  for  when  m  is  between  3  and  4,  and 
when  Gmax  is  roughly  75  %  of  Gopt-  Even  if  Gopt-  is  chosen,  the  reflection  error  on  the  order  of 

-90  dB  at  point  A  and  -75  dB  at  point  B. 

A  similar  study  is  performed  when  geometric  scaling  is  used  to  scale  the  PML  paratneters. 
Figure  1 1  is  a  contour  plot  of  the  maximum  relative  error  over  1000  time  steps  at  points  A  and  B 
versus  R(0)  and  g  for  a  10  cell  thick  PML.  Interestingly,  for  g  =  2.2,  roughly  -85  dB  of  reflection 
error  is  realized  at  both  points  A  and  B  when  ln(/?(0))  is  between  -12  and  -16.  It  is  also  observed 
that  the  effectiveness  of  the  PML  is  quite  sensitive  to  the  choice  of  g. 

As  a  final  example,  consider  a  printed  antenna  analyzed  using  the  FDTD  and  the  uniaxial  PML 
boundary  condition.  The  geometry  of  the  patch  antenna  that  was  simulated  is  superimposed  in  the 
graph  in  Fig.  12.  This  antennas  was  previously  analyzed  in  [36,  37]  The  discretization  used  for 
this  problem  was  Ax  =  0.389  mm.  Ay  =  0.4  mm,  Az  =  0.1588  mm,  and  At  =  0.441  ps.  The 
source  microstrip  line  was  50  cells  long  and  was  excited  by  a  voltage  source  with  a  Gaussian 
profile  and  a  30  GHz  bandwidth.  The  interfaces  of  the  PML  media  were  placed  3  cells  from  the 
edge  of  the  patch  antenna,  and  5  cells  above  the  surface  of  the  antenna.  The  PML  slabs  were  10 
cells  thick.  A  spatial  scaling  with  m-4,  and  Obpt  from  (7.67)  were  used.  Fig.  7.25  compares  the 
reflection  loss  \Sj]  \  computed  using  the  FDTD  and  the  PML  boundary  that  was  positioned  as 
described  above,  with  the  case  when  the  PML  boundaries  were  placed  much  further  away. 
Negligible  difference  is  observed  in  the  result  over  most  of  the  frequency  band,  riowever,  as 

expected,  there  is  some  small  error  at  very  low  frequency. 
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Fig.  6 


TE-polarized  wave  excited  by  a  vertically  directed  electric  current  source  in  a  two- 
dimensional  region.  The  working  volume  is  40  mm  X  40mm,  and  is  surrounded  by 
PML  layers  of  thickness  d.  The  source  is  located  at  the  center  of  the  region.  The  fields 
are  probed  at  two  observation  points  A  and  B. 


Fig.  7  Relative  error  at  points  A  and  B  over  1,000  time  iterations  for  5  and  10  cell  thick 
PMLs  terminating  a  40  X  40  cell  lattice  excited  by  a  small  dipole  with  crmax  =  c^opt 
and  K=  1. 
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Fig.  8  Maximum  relative  error  due  to  a  5  and  10  cell  thick  PML  termination  of  a  40  X  40 
cell  lattice  excited  by  a  small  dipole  over  1.000  time  iterations  versus  omaxlaopt 
((7.67)).  Polynomial  spatial  scaling  with  m  =  4  and  k-  =  1. 


0.1 

=~T — ^ — 1 — : — 1 — 1 — ! — 1 — : — 

.  ..  i  1  j  :  :  '  ;  .  •  j  - 

0.01 

h, 

\ 

_  \ 

. Q . 1 5  cells  (A)  ^ 

o 

^  0.001 

i 

:  \ 

—  ■+—-  1 5  cells  (B)  : 

CD 

.> 

^  0.0001 

:  \ 

=  \ 

^  k 

CD 

oc 

E  1 

^  i 

1  0'^ 

\ 

_  \ . , 

1  0“® 

!  i  / 

V 

''  * 

.  .  .  I  I  ■  .  >  I  I  ^  ■  i  *  I  >  < 

1  u 

( 

)  0.5  1 

I  1.5  2  2.5  3 

CT  /cr  . 

/  max  opt 

Fig.  9  Maximum  relative  error  due  to  a  15  thick  PML  termination  of  a  40  X  40  cell  lattice 
excited  by  a  small  dipole  over  1,000  time  iterations  versus  crmax/cropt  ((7.67)). 
Polynomial  spatial  scaling  with  w  =  4  and  k  =  1. 
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Fig.  10  Contour  plot  of  the  maximum  relative  error  at  points  A  and  B,  respectively,  due  to  a 
polynomial  scaled  10  cell  thick  PML  termination  of  a  40  X  40  cell  lattice  excited  by  a 
small  dipole  over  1,000  time  iterations  versus  cmaxloopt  and  m. 
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Fig.  1 1  Maximum  relative  error  at  points  A  and  B,  respectively,  due  to  a  geometrically  scaled 
10  cell  thick  PML  termination  of  a  40  X  40  cell  lattice  excited  by  a  small  dipole  over 
1,000  time  iterations  versus  g  and  ln(^(0)). 
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Fig.  12  |5l  1|  of  a  microstrip  fed  patch  antenna  (superimposed)  printed  on  a  31.25  mil  Duroid 

substrate(er  =  2.2)  computed  via  the  FDTD  method.  The  FDTD  lattice  is  terminated 
by  a  10-ceIl  thick  uniaxial  PML  layer  which  is  placed:  — **—  3  cells  from  the  edge  of 
the  patch  and  5  cells  above  the  patch,  and  — b—  10  cells  from  the  edges  of  the  patch. 
Source:  Gedney,  IEEE  Transactions  on  Antennas  and  Propagation,  vol.  44,  pp.  1630- 
1639,  December  1996. 
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V  Parallel  Finite-Element  Frequency-Domain  Method 

1.  Introduction 

The  finite  element  method  (FEM)  is  an  effective  means  for  analyzing  a  plethora  of 
electromagnetic  problems.  The  FEM’s  principal  attribute  is  that  it  efficiently  models  highly 
irregular  geometries  as  well  as  penetrable  and  inhomogeneous  material  media.  Secondly,  the  linear 
system  of  equations  that  results  from  a  FEM  discretization  is  highly  sparse  and  can  be  solved  using 
efficient  solution  techniques  for  sparse  matrices.  The  focus  of  this  section  is  on  the  development 
of  an  efficient  parallel  algorithm  for  the  solution  of  the  sparse  matrix  arising  from  an  FEM 
formulation.  .  An  early  approach  to  this  problem  was  a  divide-and-conquer  technique  developed  by 
Patterson  et  al.  [38,  39].  This  approach  consisted  of  partitioning  the  global  matrix  using  an 
automatic  partitioning  scheme.  Subsequently,  a  global  iterative  solver  based  on  the  Bi  Conjugate 
Gradiant  (BiCG)  method  was  used  to  solve  the  distributed  sparse  matrix. 

Alternatively,  a  parallel  direct  solution  method  for  two-dimensional  FEM  analysis  was 
introduced  by  Lee  and  Chupongstimun  [40].  This  technique  coupled  the  subdomain  solutions  by 
enforcing  tangential  field  continuity  between  adjacent  subdomains  leading  to  a  global  matrix 
representing  only  the  tangential  fields  on  the  shared  boundaries.  The  global  matrix  is  much  smaller 
than  the  original  FEM  matrix  and  can  be  solved  using  a  direct  method. 

In  [41],  Depres  introduced  a  hybrid  iterative  DMM  for  the  two-dimensional  Helmholtz  problem. 
To  this  end,  an  iterative  method  was  proposed,  for  which  each  iteration  consists  of  solving  the 
fields  interior  to  each  subdomain  and  then  constraining  the  field  continuity  at  the  interface  of  each 
subdomain  by  enforcing  a  Robin-type  transmission  condition  on  the  boundary  fields  (this 
transmission  condition  essentially  enforces  the  continuity  of  both  the  tangential  electric  and 
magnetic  field  intensities  across  the  shared  boundaries).  Depres  also  introduced  a  relaxation 
scheme  in  [42]  which  greatly  accelerated  the  iterative  process.  Later  Stupfel  [43]  extended  this 
method  by  prescribing  a  new  ABC  [44]  at  the  exterior  boundary  and  using  an  onion  like  partition 
of  the  computational  domain  improving  the  efficiency  of  the  transmission  condition  and  overall 
performance  of  this  DDM. 

The  focus  of  this  research  is  on  the  application  of  a  hybrid  iterative  solution  based  on  the  method 
of  Lagrange  multipliers.  This  method  is  modeled  after  Finite  Element  and  Tearing  Interconnecting 
(FETI)  method  originally  developed  by  Farhat  and  Roux  [45].  Specifically,  the  FEM  discretization 
of  the  weak  form  equation  for  each  subdomain  will  be  posed.  The  solutions  of  each  subdomain 
will  be  constrained  through  the  use  of  Lagrange  multipliers  by  enforcing  the  continuity  of  the 
tangential  fields  across  each  boundary  interface.  A  reduced  system  of  equations  representing  the 
Lagrange  multipliers  is  then  derived  and  is  solved  using  a  Conjugate  Gradiant  (CG)  algorithm. 
The  advantage  of  this  method  is  that  each  subregion  can  be  solved  completely  independently. 
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lending  to  a  scalable  algorithm.  Secondly,  the  number  of  iterations  required  to  solve  the  global 
problem  is  dependent  on  the  order  of  the  matrix  representing  the  Lagrange  multipliers  as  opposed 
to  the  global  matrix. 


2.  Formulation 

Assume  that  a  lossy  inhomogeneous  half  space  is  interfaced  with  a  uniaxial  anisotropic  medium 
in  the  z  =  0  plane.  In  the  anisotropic  medium.  Maxwell’s  equations  are  described  in  the  frequency 
domain  as 

VxE  =  -jcoHolir^H ^  VxH  =  jco£f,s^E  (45) 

where  is  complex  and  frequency  dependent,  and  s  is  the  tensor  in  equation  (19). 

The  vector  wave  equation  in  the  uniaxial  medium  is  then  derived  from  (45): 


Vxiu^.s  y  X  E -(O-fioSoS,.  sE  =0  (46) 

where  UPML  is  assumed  throughout  the  volume.  Performing  the  inner  product  with  a  testing 
function  defined  over  the  finite  volume  Q,  and  utilizing  Green’s  first  identity  results  in  the  weak 


form  equation: 


JJJ  Vxf-^,.5  y  xE-w-UQS^sj-sE  f-nx/u^s  V  x  = 

Q  L  -*  r  n  ^ 


(47) 


The  finite  element  solution  is  performed  by  discretizing  the  volume  into  element  domains,  and 
expanding  the  testing  and  trial  vector  functions  using  vector  edge  elements  (in  this  paper  first-order 
Whitney  elements  are  employed).  Then,  the  first  variation  of  (47)  is  evaluated  at  the  stationary 
point,  leading  to  a  sparse  linear  system  of  equations. 

Inside  the  working  volume,  the  medium  is  assumed  to  be  isotropic.  Specifically  from  (19)  and 
(20),  s  reduces  to  the  identity  tensor.  Within  the  PML  region,  5  is  anisotropic,  and  the 
parameters  are  assumed  to  be  spatially  dependent  as  a  w-th  order  polynomial  along  the  normal 
axes.  To  more  accurately  represent  the  spatial  variation,  numerical  integration  was  used  to  perform 
the  integrals  in  (47)  using  numerical  quadrature.  Furthermore,  since  the  PML  interfaces  are 
assumed  planar,  the  implementation  of  (47)  is  quite  simple  in  the  FEM  routine  and  no  special 
preprocessing  is  required.  The  exterior  of  the  PML  region  is  assumed  to  be  a  PEC  wall.  On  this 
wall,  a  Dirichlet  boundary  condition  is  enforced. 


3.  Finite  Element  Tearing  and  Interconnecting  Method 


The  Finite  Element  Tearing  and  Interconnecting  Method  (FETI)  is  a  domain  decomposition 
technique  based  on  a  hybrid  variational  principle.  For  simplicity  consider  a  domain  Q  divided 
into  non-overlapping  subdomains  Q/.  Adjacent  subdomains  will  share  common  boundaries 
defined  by  F/.y.  The  vector  electric  fields  and  testing  functions  within  each  subdomain  are 
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discretized  separately  into  finite  elements.  Then  evaluating  the  first  variation  of  (47)  at  a  stationary 
point  yields: 

Kiei=fi  (48) 

where  e/  is  the  vector  field  unknowns  in  region  Q,  (/  =  1,2) ,  Ki  is  the  stiffness  matrix,  and//  is  the 
forcing  vector  in /Q,. . 

The  discrete  fields  in  each  subregion  are  constrained  to  enforce  the  continuity  of  the  tangential 
electric  fields  across  the  shared  boundary.  Specifically,  let  B  enforce  the  continuity  of  the  discrete 
tangential  fields  across  the  shared  boundaries  Ty.  Subsequently,  using  the  method  of  Lagrange 
multipliers,  it  is  desired  to  solve  the  linear  system  of  equations 

Ke^f  (49) 

under  the  constraint 

Be  =  0  (50) 

where  is  a  block  diagonal  matrix  defined  by 

■/:,  0  •••  0  ■ 

0  K2  0 

_0  •••  0  (51) 

e  and /are  described  as 

r-’ii  r/ii 


L/a/J  (52) 

pd  B  is  also  a  block  matrix  representing  n  xYe,  -  ej)  =  0 .  It  is  noted  that  B  is  highly  sparse,  and 
all  non-zero  entries  are  simply  il.  From  the  theory  of  Lagrange  multipliers,  the  solution  of  this 
constrained  linear  system  is  equivalent  to  finding  the  stationary  point  of 

f  =  -e'^Ke-e^f+X^Be  (53) 

2 

where  A  is  the  vector  of  Lagrange  multipliers.  Taking  the  first  variation  of  (53)  and  evaluating  it  at 

the  stationary  point  leads  to  the  symmetric  linear  system  of  equations 


K  B‘ 
B  0 


nr^i=r/‘ 

)1aJ  Lo_ 


From  the  first  row  of  (54),  it  follow^  that 

e  =  K~^[f-B^x)  (55) 

combining  this  with  the  second  row  of  (54)  leads  to  the  symmetric  sparse  linear  system  of 
equations 

BK'^B^X  =  BK~^f 


(56) 
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The  parallel  algorithm  then  proceeds  as  follows;  i)  K  is  factorized  using  a  sparse  matrix 
factorization  method.  Note  that  this  is  done  completely  in  parallel  due  to  the  block  diagonal 
characteristics  of  /:  defined  in  (51).  As  a  result,  each  Ki  is  factorized  independently  and 
concurrently  on  each  processor.  //)  A  complex  CG  algorithm  is  used  to  compute  the  solution  for 
the  Lagrangian  vector  A  from  the  reduced  order  matrix  equation  in  (56).  It  is  noted  that  the  matrix 
vector  multiplies'  are  performed  completely  in  parallel  since  the  products  of  vectors  with  the  matrix 
blocks  Kf‘  and  5/  can  be  performed  explicitly  in  parallel.  Hi)  Once  the  Lagrange  multipliers  are 
computed,  the  electric  fields  are  computed  in  parallel  from  (55).  It  is  noted  that  the  matrix  in  (56) 
is  well  conditioned,  and  the  orddr  of  the  matrix  is  greatly  reduced  as  compared  to  the  order  of  the 
global  matrix.  As  a  result,  the  iterative  scheme  is  expected  to  converge  rapidly.  It  is  also  noted 
that  if  there  is  symmetry  or  repetitiveness  in  the  geometry  resulting  in  blocks  with  identical  AT/,  then 
such  blocks  only  need  to  be  stored  and  factored  once  on  one  processor.  (It  is  noted  that  proper 
load  balance  should  still  monitored  such  instances). 


4.  Results 

Initially,  the  effectiveness  of  the  UPML  absorbing  media  will  be  presented  for  the  frequency- 
dependent  analysis.  Then,  the  efficiency  of  the  FETI  solution  will  be  studied  for  FEM  models 
with  UPML.  To  study  the  effectiveness  of  the  FETI  for  solving  PML  meshes,  an  air  filled 
rectangular  waveguide  was  initially  studied.  This  problem  was  chosen  since  the  exact  solution  is 
known.  Secondly,  it  isolates  the  performance  of  a  single  wall  in  the  discrete  space.  The 
waveguide  is  assumed  to  have  a  cross  section  of  a=  0.4m  by  b  =0.4m.  The  TEQ^  mode  is 

excited  at  one  end  of  the  waveguide,  and  the  other  end  is  terminated  into  a  UPML  layer.  The 
waveguide  was  excited  at  400  MHz  which  is  above  the  TEo^  mode  cutoff  frequency  of  375  MHz. 

Initially,  the  waveguide  length  was  set  to  1.1m  and  the  depth  of  the  PML  was  0.4  m  which  is 
effectively  8  cell  radii  thick.  First-order  tetrahedral  edge  elements  or  Whitney  elements  were  used 
in  this  study,  and  there  was  no  effort  to  make  the  tetrahedral  elements  syinmetric  about  any  axis. 
We  looked  at  the  effect  of  varying  the  order  of  the  spatial  polynomial  from  w  =  0  to  w  =  3 .  In 
addition  k,  was  set  to  1  for  the  propagating  mode,  and  effectively  was  varied  by 

.  ■  '  ^  ^  l' 

correspondingly  varying  the  theoretical  reflection  coefficient  A,,,,  which  is  defined  as  [26]: 


_2^jm.pid^(m+\) 

D  —a  “>^o 

where  m  is  the  order  of  the  spatial  polynomial 
Subsequently,  is  given  by: 


(57) 

and  d  is  the  thickness  of  the  PML  layer  in  meters. 


(58) 
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The  reflection  coefficient  computed  numerically  is  illustrated  in  Fig.  13.  Superimposed  in  tliis 
graph  is  the  theoretical  prediction  of  Rth-  The  theoretically  predicted  reflection  coefficient  is 
followed  up  to  an  optimal  value.  The  case  when  m  =  3  performed  optimally,  and  provides  -60  dB 
to  -70  dB  of  absorption  near  its  optimal  value  near  ln(i?//,)  =  -6.6.  Choosing  above  this  value 
leads  to  a  degradation  in  the  performance  of  the  PML. 

The  parallel  FETI  solution  method  was  implemented  on  a  32  processor  Intel  iPSC/860 
hypercube.  For  scalability  analysis,  the  physical  length  of  the  waveguide  was  varied  from  0.8  m 
to  12.8  m  doubling  the  length  at  each  step  and  simultaneously  doubling  the  number  of  processors. 
(Note  that  the  PML  depth  was  kept  constant  (d  =  0A  m)).  At  each  step  the  number  of  processors 
used  was  also  doubled  keeping  the  number  of  elements  per  processor  constant.  This  provided  for 
2i. scaled  speedup  analysis.  Scaled  speedup  provides  a  measure  of  the  scalability  of  the  algorithm 
by  scaling  the  problem  size  with  the  number  of  processors.  Scaled  speedup  can  be  determined  by 
summing  the  efficiencies  of  each  processor.  This  method  essentially  determines  the  amount  of  time 
it  would  take  a  single  processor  to  solve  the  same  problem  as  the  multi-processor  system.  Finally, 
the  order  of  the  spatial  variation  was  set  to  /w=0  and  the  conductivity  was  set  by 
choosing  i?,;,=1.367e-05. 

Table  3  summarizes  the  results  from  varying  the  number  of  processors  from  two  to  thirty-two 
with  each  processor  responsible  for  3072  elements.  The  discretization  used  for  this  problem  was  a 
simple  one-way  dissection.  As  a  result,  for  each  case  there  are  (P-1)  common  boundaries  with  176 
common  edges  per  boundary  where  P  is  the  number  of  processors  used.  The  waveguide  problem 
solution  was  found  by  solving  the  local  problem  using  a  direct  solver  while  solving  the  global 
problem  using  an  iterative  solver  (PCGM).  The  local  problem  could  have  also  been  found  using  a 

iterative  solver  but  for  this  study  a  direct  solver  was  chosen. 

To  compare  with  the  FETI  method.  Table  4  shows  the  results  from  solving  the  same  problem 
using  a  single  node  on  a  SGI  Power  Challenge  using  a  PCG  method  with  diagonal 
preconditioning.  It  is  interesting  to  note  that  the  number  of  iterations  has  dramatically  increased. 
This  is  due  to  the  fact  that  the  FEM  matrix  becomes  highly  ill-conditioned  with  PML  present. 
Interestingly,  the  reduced  matrix  resulting  from  the  FETI  solution  is  very  well  conditioned,  as 
illustrated  in  Table  3. 
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Fig.  13.  Fitted  reflection  error  for  a  0.4m  x  0.4m  x  1.1m  waveguide  with  a  0.4m  PML  for  different  values  of 
the  polynomial  variation,  m,  and  within  the  PML. 


Table  3.  Comparison  of  results  for  the  FETI  algorithm  with  3072  cells  per 

Processor 


Processors 

2 

4 

8 

16 

32 

Total  number  of  Unknowns 

6240 

12656 

25488" 

51152 

102480 

#  of  Lagrangian  Multipliers 

176 

528 

1232 

2640 

5456 

Iterations 

57 

133 

187 

261 

385 

Scaled  Speedup 

1.27 

2.21 

4.99 

24.2 

Execution  Time  (sec) 

456 

{ 

Table  4.  Results  for  waveguide  problem  on  a  Single  Processor  using  Diagonal 

Preconditioning. 


Total  number  of  Unknowns 

6240 

12656 

25488" 

51152 

102480 

Iterations 

4212 

6974 

12028 

22063 

6 1 642 

Execution  Time  (sec) 

107 

394 

1604 

6733 

45719 

VI  A  Parallel  Unconditionally  Stable  Finite-Element  Time-Domain 
Algorithm 

The  previous  section  presented  a  parallel  finite-element  frequency-domain  algorithm.  The  focus 
of  this  section  is  the  development  of  a  parallel  finite-element  time-domain  algorithm.  Finite  element 
time-domain  based  solutions  of  Maxwell's  equations  have  been  developed  using  both  implicit  [46]^ 
and  explicit  methods  [47].  For  an  implicit  method  to  be  competitive  with  an  explicit  method,  the 
number  of  time  iterations  required  to  converge  to  a  final  solution  must  be  significantly  less,  since 
each  time  iteration  requires  a  solution  of  a  linear  system  of  equations.  Unfortunately,  the  implicit 
finite  element  time-domain  (FETD)  methods  presented  in  [46]  are  conditionally  stable,  and  in  fact, 
stability  can  lead  to  time  steps  that  are  smaller  than  that  required  by  explicit  FDTD  methods  [9]. 
'  Therefore,  it  becomes  necessary  to  develop  a  technique  that  is  unconditional  stable.  The  time- 
dependent  vector  wave  equation  for  a  lossy,  inhomogeneous  space  is  written  as 

jU;.  ct  ct  at 

A  variational  expression  is  then  derived  from  this  equation.  Then,  expanding  the  field  using  vector 
finite  elements  leads  to  a  linear  system  of  equations  with  time  dependent  coefficients  e 

A  difference  equation  is  derived  using  the  Newmark  Beta  formulation,  leading  to  the  implicit 
formulation  for  the  electric  field 

-  [r,]-}l7„c«A([7-„]  +  /3(c„A/)2[5]]e"-'  -(c„A()^[/r*'  +(1-2/?)/”*'  +^/”‘']}(61) 


where  P  is  the  Newmark  Beta  coefficient.  Performing  a  stability  analysis,  it  is  found  that  if 
P  >  0.25,  this  implicit  formulation  is  unconditionally  stable  [6].  It  is  further  found  that  the 
optimal  choice  for  p  is  0.25,  as  this  leads  to  minimal  error.  This  is  intuitive  since  if  p  =  0.25, 
this  formulation  is  equivalent  to  that  derived  using  a  trapezoidal  time-integration  scheme.  The 
principal  advantage  of  unconditional  stability  is  that  the  time  step  is  no  longer  governed  by  the 
mesh  quality  and  the  resulting  eigenspectrum  of  the  linear  system,  but  rather  by  the  spectral  content 
of  the  signal  propagating  through  the  mesh.  For  printed  circuit  applications,  this  has  a  great 
implication  as  cell  sizes  can  easily  be  on  the  order  of  0.001  Xmin  or  smaller. 

A  parallel  algorithm  was  implemented  to  perform  the  implicit  time-dependent  solution  based  on 
the  FETI  algorithm  described  in  the  previous  section.  As  expected,  this  has  provided  excellent 
speedups.  As  a  demonstration  of  the  effectiveness  of  the  algorithm,  we  consider  the  problem  of 
computing  the  resonant  frequencies  of  a  cavity  loaded  with  a  PEC  wedge,  as  illustrated  in  Fig.  14. 
The  cavity  has  the  dimensions  1.0  m  by  1.0  m  by  1.5  m.  The  wedge  has  a  base  area  of  0.2  m 


Fig.  14  A  PEC  Wedge  in  a  cavity  Fig.  15  Tetrahedral  mesh  of  the  surface  of  the  wedge  and 

cavity 


0.000  0.100  0,200  0..^00  0,400  fl..‘50t)  0.4mt  0.500  (i.tMin  ii.XKi  . . . 

Tini»Slrp(n!i)  Tlmr  Stcp(ns)  ,  Time  Step  ( nsf 

(a)  (b)  (c) 

Fig.  16  Variation  of  resonant  frequency  v/s  the  time  step. 

by  0.2  m  and  a  height  of  0.75  m.  This  problem  was  modeled  using  SDRC/I-DEAS  using 
tetrahedral  elements.  The  maximum  element  size  is  0. 1  m  and  the  minimum  size  is  0.005  m. 
Near  the  apex  of  the  wedge,  the  element  sizes  are  very  small  so  that  the  fine  geometries  in  this 
region  can  be  accurately  measured.  The  surface  mesh  on  part  of  the  geometry  is  shown  in  Fig.  15. 

Figures  I6.(a)-(c)  illustrate  the  variation  in  the  resonant  frequencies  as  the  time  step  is  increased. 
We  notice  that  the  error  in  the  resonant  frequency  is  less  than  1 .0  %  for  all  values  of  the  time  step, 
except  for  the  last  data  point.  A  central  difference  algorithm  has  a  limit  on  the  time  step  of 
0.0075  ns  because  of  stability  considerations. 

Figure  17  illustrates  the  variation  of  the  total  execution  time  for  the  time  simulation  as  the  time 
step  is  increased.  It  is  noted  that  the  execution  time  decreases  as  the  time  step  is  increased.  This 
decrease  in  the  execution  time  is  very  sharp  for  the  smaller  time  steps,  but  as  the  time  step  is  made 
larger,  this  benefit  reduces.  Figure  17  also  illustrates  the  variation  in  the  number  of  iterations  per 
time  step  for  the  Conjugate  Gradient  algorithm  versus  the  time  step.  We  see  that  as  the  time  step  is 
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■♦“Fxecution  Time  fs)  Complete  Simulation 
•  Averaue  No.  of  C.G.  iterations 


Time  Step  (ns) 

Fig.  1 7  Variation  of  Execution  time  and  Average  number  of  CG  iterations  per  time 
step  with  the  time  step. 

) 

increased,  the  CG  algorithih  requires  approximately  the  same  number  of  iterations  for  convergence 
for  small  time  steps,  but  this  number  grows  almost  in  direct  proportion  to  the  time  step.  Thus,  an 
appropriate  choice  for  selecting  the  simulation  time  step  can  be  made  by  considering  both  Figs.  16 
and  17  to  give  reasonable  accuracy  and  fast  execution.  For  this  problem,  an  appropriate  choice  for 
the  time  step  would  be  approximately  0.1  ns.  Note  that  this  time  step  is  more  than  an  order  of 
magnitude  larger  than  the  central  difference  algorithm  stability  limit  and  can  reduce  execution  time 
for  the  complete  simulation  by  that  factor. 
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VII  Orthogonal  and  Nonorthogonal  FDTD  Analysis  of  Periodic 
Structures 

1 .  Introduction 

Frequently,  it  is  of  interest  to  analyze  the  electromagnetic  scattering  by  bodies  with  geometries 
with  periodicity  in  one  or  more  dimension.  Sortie  examples  of  applications  of  periodic  structures 
are  frequency  selective  surfaces,  active  grid  arrays,  or  photonic  band  gap  structures.  Taking 
advantage  of  this  periodicity  can  lead  to  greater  efficiency  and  accuracy  when  solving  the  problem 
numerically. 

Typically  each  periodic  feature  is  referred  to  as  a  cell  and  the  periodicity  of  these  cells  is 
accounted  for  using  Floquet  theory.  For  a  normally  incident  plane  wave,  accounting  for  this 
periodicity  in  either  an  orthogonal  or  non-orthogonal  FDTD  method  is  quite  straightforward  as 
there  is  no  phase  shift  between  each  periodic  cell  [48].  However,  when  a  plane  wave  source  is 
obliquely  incident  there  is  a  cell-to-cell  phase  variation  between  corresponding  points  in  different 
unit  cells  which  causes  the  time-domain  implementation  to  become  more  difficult. 

For  oblique  incidence  plane  waves,  a  Floquet  field  mapping  may  be  applied  which  results  in  a 
set  of  mapped  fields  which  possess  the  same  cell-to-cell  field  relations  as  exist  for  the  normally 
incident,  unmapped  fields  [49].  The  resulting  equations  may  add  considerable  complexity  to  the 
FDTD  solution  [50]  an^  lead  to  a  more  stringent  stability  relation  for  higher  angles  of  incidence. 
At  present,  the  periodic  methods  which  are  available  have  only  been  applied  using  the  orthogonal 
FDTD  method. 

In  this  work,  the  Floquet-mapped  periodic  FDTD  equations  are  solved  using  an  alternative 
approach  referred  to  as  the  split-field  update  method.  This  technique  is  shown  to  be  simple  to 
implement  and  a  stability  analysis  shows  the  technique  to  have  a  less  strict  stability  criterion  than 
previous  implementations.  Subsequently,  the  split-field  update  method  is  applied  in  a  general 
curvilinear  space  using  the  non-orthogonal  FDTD  technique.  The  use  of  Floquet-mapped  FDTD  in 
non-orthogonal  grids  may  lead  to  further  computational  savings  due  to  fewer  and  larger  cells.  The 
split-field  update  technique  is  validated  by  comparison  of  the  numerical  results  with  measured  data. 
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2.  Formulation  - 

The  field  components  of  a  TMz  plane  wave  incident  on  a  two  dimensional  material  which  is 
periodic  in  the  x  direction  will  have  a  phase  shift  of  the  form  where  is  the 


speed  of  light  in  free  space,  and  6  is  the  angle  of  propagation  for  the  incident  field.  A  set  of 
auxiliary  variables  is  introduced  which  implicitly  accounts  for  this  phase  shift  as  [49] 


(62) 
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Substituting  these  expressions  into  Maxwell's  curl  equations  maps  the  solution  space  such  that 
there  is  no  phase  shift  between  the  computed  fields  at  like  positions  in  each  periodic  cell.  Now 
boundary  conditions  at  the  periodic  boundaries  consist  of  constraining  the  tangential  fields  on 
opposing  boundary  walls  to  be  equivalent.  The  application  of  this  mapping  in  orthogonal  is  first 
presented  and  then  for  non-orthogonal  grids. 

J.  Orthogonal  Grid  FDTD 

Substituting  (62)  into  Maxwell’s  curl  equations  leads  to; 

irJJnln  (63) 


'  V,, 

(65) 

Vo  dx  Vo 

Notice  that  the  substitution  of  (62)  has  produced  extra  terms  (denoted  by  brackets)  on  the  right 
hand  side  of  (63)  and  (65).  When  (63)  and  (65)  are  discretized,  the  presence  of  these  extra  terms 
leads  to  difficulty.  One  difficulty  arises  due  to  the  appearance  of  the  time  derivative  {j(o  =>  ^)  on 

both  sides  of  these  equations.  Another  difficulty  arises  because  the  right  hand  sides  are  no  longer 
spatially  aligned.  These  difficulties  may  be  overcome  by  introducing  dual  grids  in  time  and  multiple 
grids  in  space  [50]. 

In  this  work  an  alternate  approach  is  taken  which  is  numerically  stable,  efficient,  and  applicable 
in  either  orthogonal  or  non-orthogonal  grids.  The  technique  used  here  will  be  referred  to  as  the 
split-field  update  method.  Equation  (63)  is  now  'split'  into  two  parts  by  defining  P.  =P^  +P. 

where 

=  '  (66) 

(67) 

Equation  (43)  is  split  similarly  by  defining  Qy  =  Qy  +  Qy  where 

=  ^  (68) 


nb-EUlp 


Substituting  the  split  forms  of  Qy  along  with  (69)  into  (67)  and  solving  for  Pz  then  gives 
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e;. 

*  1  sin^  6  c  —  SwlS.  ■ 

'■  ''  l^r 

Equations  (64),  (66),  and  (68)  and  (70)  are  discretized  using  a  spatially  interleaved  Yee  lattice. 
Furthermore,  a  dual  time  grid  is  introduced  such  that  each  field  component  of  both  P  and  Q  is 
computed  at  each  halftime  step.  Note  that  the  time  derivatives  are  applied  using  central  differencing 
in  time  as  usual,  but  now  these  updates  are  computed  at  each  half  time  step.  Spatial  alignment 
required  in  (70)  is  accomplished  by  field  averaging  which  provides  second  order  accurate  results 

without  resorting  to  multiple  spatial  grids.  The  resultant  update  equations  are 


PH  = 


1  _  sin"  0 


where 


sin9lp„ 

Pr  ^ 

\  . 

At 

_  At 

A  *  ^ 

Sf./Sx 

sAy 

At 

pAy 


Implementation  of  (71)  through  (75)  has  proven  that  this  technique  is  stable  for  all  angles  under  90 
degrees. 

An  exact  stability  relationship  was  derived  for  these  update  equations  [51]: 

A,<-!— - ,  ^ - ?=  w 


Vo  l5m  e\sin  ^cos^  +  ^sin^  9  si  A  ^  cos'  ^  +  (sin^  ^  +  a)cos^  9 


where 


_l  1  1 1  +  sin^  9(4a  +  2)  -'I sir?  9(a - sirT  9)  +  1 

2  y  sirP"  9(2  +  3a)  +  a 

V  / 

and  u  =  ("Ac /Ay/.  It  is  noted  that  this  stability  criterion  is  must  less  stringent  than  that  of  the 
algorithm  in  [49]  at  higher  angles  of  incidence.  In  fact,  near  grazing  the  time  step  can  be  increased 
by  as  much  as  a  factor  of  3. 
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Next,  the  split-field  update  method  is  applied  to  a  two-dimensional  non-orthogonal  grid.  To  this 
end,  it  is  assumed  that  the  grid  is  structured,  but  irregular  and  non-orthogonal.  Each  cell  is  locally 
defined  by  a  curvilinear  coordinate  axis,  defined  as  (m/,  uj,  us).  The  axis  is  defined  on  a  local 
basis  by  the  unitary  vectors  d^,  dj,  d^,  where  d^  is  orthogonal  to  the  «/,  «2  plane.  In  the  general 

curvilinear  space,  the  covariant  field  values  are  tangential  to  the  unitary  vectors  and  are  designated 
by  subscripts.  The  contravariant  field  vectors  are  normal  to  the  cell  faces  whose  edges  are  defined 
by  the  unitary  vectors  and  are  designated  by  superscripts.  In  the  orthogonal  space,  a  mapping  of 
the  form  in  (40)  was  assumed.  This  mapping  is  projected  into  the  general  curvilinear  space  as; 

Ho  t 

^  ^^^~J(P\^^\'^P2^^2)  (78) 

where,  Pi  =  k.,  ■  ai.^.  Pi  =  Kx '  «2.v’  and  a/.v  and  ajx  represent  the  x-directed  component  of  the 
normalized  unitary  vectors  dy  and  d^,  respectively.  An  equivalent  expression  to  (78)  is  also  used 
for  the  contravariant  field  components. 

The  mapped  fields  in  (78)  are  then  inserted  into  the  differential  form  of  Maxwell’s  equations  in 

general  curvilinear  coordinates  [52].  For  the  TMz  case,  this  leads  to: 

,,  1  a  (791 


1  ( 80')  dQ\  \  ,  sin  9  ^  .  sin  0  ^ 

-r  — f-ay_xQ2-J(o-^a2,Qi, 

1  dPT.  .  Sind  _ 

■ - r~  ^2x^2’ 

■Jg  dU2  V^sjg 


.  PrQ 

J(0^^ 


\  dP-y  .  Sind  o 
Vo  V^xjg 


Note  that  the  substitution  of  (78)  has  resulted  in  extra  terms  (denoted  by  brackets)  on  the  right- 
hand-side  of  (79)-(81).  Once  again,  these  terms  are  handled  efficiently  via  a  field  splitting. 

Specifically,  let  and  where 

.  ^  1  _  sine  „ 


.  PrQ 


11 

'ItS 

,  „2b  'Sind 

dU2^ 

r«i 

1  PP3 
■yjg  dU2 

QXh 

sin  d  „ 

=  ^2x^3’ 

Pr-ig 

_  1  dP^ 

sin  d  r> 

z  n. 

Vg  ’ 

[  3* 

Pr-Jg 

£r-^g 


^U2xQ], 
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Finally,  each  covariant  component  is  projected  from  the  contravariant  field  components  through  the 


mapping 


j-i 

where  gy  =  dj  •  dj  and  ^Ig  =  d2  {di  x  ^2). 

To  solve  the  split  field  equations  in  (82)  and  (83),  through  the  manipulation  of  terms,  it  can  be 


shown  that 


sin  9  '  sin  9  ^ 

P  I —  ^IxQla  I  ^lxQ\i 

r  — - 7  \ 


\-sin9  ^A/,-%M2 
>.yS  yS 


where 


^ixS\  1  ^  a\xSi\^i>^^ 


,  M,= 


^2xS\2^dt9  di^.g22^^in9 

SrPr^jg  SrPi^g 


‘  £rP>4g  ^rPr4g  ’  “  ^rPrig 

and  hence  eliminating  and 

Finally,  (82)  through  (87)  can  be  implemented  in  a  discrete  space  in  a  similar  manner  as  that 
presented  for  the  orthogonal  grid  algorithm.  Specifically,  both  P  and  Q  are  updated  at  every  half 
time  step.  Then,  and  are  updated  using  (82)  -  (87).  It  is  noted  that  P3  =  P^  due  to 

the  assumption  of  a  two-dimensional  geometry. 


5.  UPML  Boundary  Condition 

In  both  the  orthogonal  and  the  non-orthogonal  grid  codes,  the  uniaxial  PML  method  was 
successfully  employed  to  terminate  the  non-periodic  boundaries.  This  was  based  on  a  simple 
extension  of  the  work  in  [14,  51]. 


6.  Validation 

To  validate  both  the  non-orthogonal  and  orthogonal  formulations,  the  algorithms  were  applied  to 
the  analysis  of  a  photonic  bandgap  (PBG)  stmcture.  The  results  were  then  compared  to  pleasured 
results  provided  by  Georgia  Tech  Research  Institute,  in  Atlanta,  GA.  The  geometry  under  study  is 
illustrated  in  Figure  18.  Each  unit  cell  consists  of  four  infinitely  long  dielectric  rods  with  a  radius 
of  a  =  2mm  and  a  dielectric  constant  of  s,-  =  4.2.  The  rods  are  arranged  in  a  square  lattice  such  that 
the  center-to-center  separation  distance  is  equal  to  the  unit  cell  width  where  b  =  9.0  mm. 
Simulations  were  run  at  angles  of  incidence  of  0  to  50  degrees  were  experimental  data 
(transmission  coefficient)  was  provided. 

This  structure  was  analyzed  using  both  the  orthogonal  and  non-orthogonal  implementations 
formulated  in  this  work.  The  orthogonal  FDTD  code  is  based  on  (49)  through  (53)  with  each 
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Fig.  18  Cross-section  of  a  slab  of  two-dimensional  photonic  band-gap  material. 

dielectric  rod  discretized  at  16  cells  (A,v  —  Av  —  0.25  tnm)  across  the  diameter  of  rod.  The  grid 
dimensions  were  39  x  229  including  a  10  cell  PML  region  at  the  two  ^’-directed  walls.  Test  were 
run  which  showed  that  the  numerical  solution  had  converged  at  this  discretization.  In  the  non- 
orthogonal  FDTD  based  on  (60)  through  (65)  only  six  cells  across  the  diameter  of  the  dielectric 
rods  were  used  and  the  grid  dimensions  in  were  19  x  105  cells  including  a  10  cell  PML  region.  For 
each  case,  the  PML  parameters  were  optimally  chosen  based  on  [14,  5 1], 

Figures  19  illustrate  the  level  of  agreement  between  the  two  numerical  methods  as  well  as  with  the 
measured  data  for  a  TMz  incident  plane  wave  incident  at  20^  off  normal.  The  level  of  agreement  is 
quite  good  and  a  clear  bandgap  of  at  least  10  dB  is  indicated.  The  error  is  very  sfnall  for  either  case 
and  probably  within  the  margin  of  measurable  accuracy.  The  orthogonal  implementation  of  this 
method  was  stable  for  all  angles  of  incidence  less  than  90°.  The  non-orthogonal  method  was  stable 
for  angles  of  incidence  less  than  60°.  For  angles  greater  than  60°  the  method  did  go  unstable  in  the 
very  late  time. 


40 


S.  Gedney,  Rigorous  Analysis  of  Large  Scale  MMIC  Circuit  Devices 


4/98 


Fig.  19  Transmission  coefficient  of  four  rod  deep  photonic  bandgap  structure  for  a  TMz  plane 
wave  incident  at  20  degrees,  (a)  magnitude,  (b)  phase.  Periodic  cell  is  9mm  wide  and 
rod  diameter  is  4  mm,  and  sr  =  4.2. 
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VIII  Incorporating  the  State  Variable  Technique  into  the  FDTD  and 
PGY  Methods  for  Modeling  Linear  and  Nonlinear  Devices 

1.  Introduction 

The  formulations  for  modeling  lumped  elements  using  the  FDTD  method  have  been  developed 
for  several  years  [53],  [54]  and  the  numerical  results  computed  by  the  FDTD  method  are  in 
excellent  agreement  with  those  computed  by  other  CAD  tools  such  as  SPICE.  Research  for 
connecting  the  SPICE  with  the  FDTD  analysis  has  also  been  conducted  [55].  To  further  extend  the 
applications  of  the  FDTD  method,  the  state-variable  technique  has  recently  been  incorporated  into 
the  FDTD  algorithm  in  analyzing  the  small  signal  equivalent  circuits  for  microwave  active  devices 
[56,  57]  and  the  nonlinear  large  signal  model  for  GaAs  MESFET  amplifiers  [58].  There  are  two  ^ 
basic  techniques  for  modeling  microwave  active  devices.  The  first  technique  is  to  replace  the  active 
regions  of  the  devices  by  equivalent  current  sources  with  parallel  FDTD  cell  capacitance's.  The 
second  technique  is  to  replace  the  active  regions  of  the  devices  by  equivalent  voltage  sources  with 
series  FDTD  cell  inductance’s.  These  two  techniques  are  referred  as  the  current-source  approach 
[56]  and  the  voltage  source  approach  [57],  respectively.  It  is  important  to  realize  that  the  current- 
source  and  the  voltage-source  approaches  are  dual.  The  former  is  based  on  Ampere  s  law  and  the 
latter  is  based  on  Faraday's  law.  The  current-source  (CS)  approach  has  an  advantage  over  the 
voltage-source  (VS)  approach  in  the  sense  that  the  CS  approach  provides  more  physical  insight  to 
the  problem.  For  instance,  when  modeling  the  active  regions,  such  as  the  gate  port  and  the  drain 
port,  of  a  microwave  active  device,  those  regions  are  usually  divided  into  several  FDTD  cells. 
Then  the  net  current  sources  and  the  net  capacitances  at  those  regions  are  just  the  sums  of  the 
individual  current  source  and  capacitance  for  each  cell  respectively  (since  the  sources  and 
capacitances  are  in  parallel).  The  VS  approach  does  not  yield  similar  physical  insight  [57].^  On  the 
other  hand,  the  VS  approach  can  sometimes  simplify  the  state-variable  model  for  the  equivalent 
circuit  of  a  microwave  active  device  by  reducing  the  number  of  the  state  variables  that  are  to  be 
solved  [58].  This  is  the  main  advantage  for  using  the  VS  approach.  Due  to  the  straightforward 
implementation  and  physical  understanding  of  the  CS  approach,  this  paper  will  follow  the 
formulation  presented  in  [56].  The  CS  approach  will  be  briefly  reviewed  in  Section  2. 

There  is  no  doubt  that  the  major  task  for  modeling  a  nonlinear  device  is  to  solve  for  a  set  of 
nonlinear  equations  that  describe  the  characteristics  of  the  device.  One  of  the  most  popular 
algorithms  is  the  Newton-Raphson  algorithm.  It  is  obvious  that  a  set  of  nonlinear  differential 
equations  needs  to  be  used  to  describe  the  nonlinear  state-variable  model  for  the  device.  This  set  of 
differential  equations  can  be  transformed  into  a  set  of  nonlinear  algebraic  equations  by  using  the 
predictor-corrector  algorithm  [59].  The  Newton-Raphson  algorithm  is  then  applied  to  solve  for 
this  set  of  nonlinear  algebraic  equations.  In  order  to  assure  that  the  solutions  for  the  nonlinear 
equations  always  converge  to  correct  values,  a  globally  convergent  method  based  on  the  Newton's 
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method  [60]  is  employed  in  this  paper.  Some  of  the  basic  concepts  for  this  convergent  method  will 
be  discussed  in  Section  III.  To  consolidate  the  validity  for  the  use  of  state-variable  technique  along 
with  the  FDTD  algorithm  in  performing  full-wave  analyses  on  microwave  circuits,  several 
numerical  examples  will  be  presented  in  Section  IV.  Finally,  the  conclusion  is  drawn  in  Section  V. 


2.  The  Cwrent-Source  Approach 

For  the  sake  of  discussion  and  simplicity,  it  is  assumed  that  the  lumped  elements  are  z-directed. 
The  integral  form  for  Ampere's  law  is 

^H-dl=-\\sE-ds+\\j{,E)-ds  (88) 

css 

Assume  a  lumped  device  is  placed  on  an  edge  of  the  primary  grid.  Then  (88)  is  applied  to  the 
secondary  grid  face  through  which  the  edge  passes.  To  represent  this  more  conveniently,  (88)  is 


rewritten  aS 

dV 

where  I  tot  is  the  total  current  given  by  the  line  integral  in  (89)  and  is  performed 


(89) 

about  secondary 


grid  face,  Cf  =  sAI  Ar,  where  A  is  the  cross  area  of  the  face  and  Az  is  the  length  of  the  device 


loaded  edge,  and  Idevi^)  is  the  current  flowing  through  the  device.  The  equivalent  circuit 


representing  (89)  is  illustrated  in  Fig.  20.  Thus,  given  Itou  a  state  variable  approach  will  be  used 
to  solve  (89)  for  the  voltage  V.  Subsequently,  V  will  be  used  to  calculate  the  electric  field  on  the 


edge. 


3.  State-Variable  Solution 

A  state  variable  method  will  be  used  to  solve  for  the  device  current  and  the  port  voltage  as 
illustrated  in  Fig.  20.  The  means  by  which  the  solution  is  derived  will  differ  for  linear  and 
nonlinear  devices.  Initially,  assume  that  the  edge  is  loaded  with  a  linear  device,  such  as  a  lumped 
passive  circuit.  In  such  a  case,  the  state  equation  can  be  generically  stated  as: 

x  =  Ax{t)  +  bu{t)  (^^) 

where  x  is  the  state-variable  vector,  is  a  square  matrix,  6  is  a  vector  associated  with  the  excitation 
u(t).  There  exist  well-known  analytical  solutions  to  (5)  for  canonical  problems  [59].  However, 
(5)  is  solved  numerically  for  sake  of  generality.  Using  the  second-order  finite-difference  scheme, 
(5)  is  written  as 


where  I  is  an  identity  matrix  and  the  superscript  indicates  the  time  step.  Equation  (91)  is  an  implicit 
scheme.  Although,  the  order  of  ^  is  assumed  to  be  small  can  be  solved  using  LU  factorization.  It 
should  be  noted  that  if  there  is  more  than  one  source  then  b  will  become  a  matrix. 
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It  can  be  shown  that  in  some  instances,  directly  incorporating  (91)  into  the  FDTD  algorithm  can 
lead  to  instability.  For  example,  terminating  a  microstrip  line  by  a  linear  inductive  load  with  a  very 
small  inductance  can  lead  to  instability.  To  guarantee  a  stable  solution  for  all  linear  networks,  the 
first-order  backward  scheme  (Euler's  algorithm)  can  be  used.  This  leads  to 

'(/-AM)x"^‘=x"  +  ArZ)M"^'.  (92) 

For  the  nonlinear  case,  the  state  equation  is  written  in  the  form  of  [59] 
x  =  f{x,u).  (93) 

To  solve  the  nonlinear  state  equation,  the  predictor-corrector  algorithm  coupled  with  the  Newton- 
Raphson  algorithm  is  used.  In  particular,  the  first-order  Adams-Bashforth  predictor  and  the  first- 
order  Adams-Moulton  corrector  [59]  are  used: 
predictor: 

.r"^'-(0)=x”+A//’(x",M")  (94.a) 

corrector: 

;c«+‘’(^)=x"+Ar/(x"+'’<^"'\M"^')  (94-b) 

where  the  superscript  in  the  parenthesis  designates  the/^'  iteration  at  the  «+!  time  step.  Notice  that 
(94.a)  is  an  explicit  algorithm  while  (94.b)  is  an  implicit  one.  Note  also  that  the  set  of  differential 
equations  (93)  has  been  transformed  into  a  set  of  algebraic  equations  in  (94.a)  and  (94.b).  Other 
higher  order  predictor-corrector  algorithms  could  also  be  used.  However,  the  absolute  stability 
regions  for  those  higher-order  algorithms  are  very  restrictive  [59].  It  is  found  that  the  use  of  the 
second-order  Adams-Bashforth  predictor  and  the  use  of  the  second-order  Adams-Moulton 
corrector  will  lead  to  late  time  instability  when  they  are  applied  to  solve  the  nonlinear  state 
equations  for  the  large  signal  model  of  a  GaAs  MESFET  amplifier.  It  is  important  to  realize  that 
the  predictor-corrector  algorithm  itself  can  be  used  to  solve  for  a  set  of  nonlinear  differential 
equations  without  the  aide  of  the  Newton-Raphson  algorithm.  However,  a  systematic  strategy 
should  be  used  to  assure  the  convergence  of  the  solutions  to  the  nonlinear  differential  equations.  A 


globally  convergent  method  was  proposed  in  [60]  and  is  used  in  this  paper. 

The  state  equation  (93)  can  be  written  as 

x-f(x,u)  =  0.  (93) 

Applying  the  first-order  Adams-Moulton  algorithm  (94.a)  to  (95)  leads  to 

F(x”+‘,M"+')  =  x"''*-Jc”-At/(jc"^',M"+')  =  0.  (96) 

Expanding  (96)  into  a  Taylor's  series  yields  solutions  for  the  state  variables  at  the  «+l  time  step 
[61],  namely 

J5x  =  -Fix"^\u"*^)  (97.a) 
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where  J  is  the  Jacobian  matrix.  Note  that  the  predictor  (94.a)  can  be  used  as  an  initial  guess  for 
Xow  in  the  Newton-Raphson  algorithm. 

The  problem  of  solving  the  set  of  nonlinear  equations  (96)  is  the  same  as  that  for  solving  an 
unconstrained  minimization  problem  of  the  form 

g=^-F-F.  W 

From  a  geometrical  standpoint,  it  is  obvious  that  the  step  5x  should  lead  the  function  g  to  its 
descent  direction  at  each  iteration.  Due  to  the  quadratic  convergence  of  the  Newton  method,  it  is 
always  a  good  idea  to  try  the  full  Newton  step  first.  If  the  solution  is  not  close  enough  to  the 
minimum  ofg,  the  full  Newton  step  may  not  necessarily  decrease  the  function  [60, 61].  However, 
since  the  Newton  step  is  a  descent  direction  for  g,  it  is  guaranteed  that  an  acceptable  step  can  be 
found  for  reducing  the  function  by  backtracking  along  the  Newton  direction.  The  backtracking 
algorithm  is  based  on  line  searches.  The  solution  (97.b)  is  then  rewritten  as 

where />  =  It  is  not  sufficient  to  require  merely  that  g(x"^,'.)<g(x"y  )  since  this  criterion  can 

fail  to  converge  to  a  minimum  of  g  [61].  A  better  strategy  is  to  require  the  average  rate  of  decrease 
ofg  be  some  fraction  a  of  the  initial  rate  of  decrease  ^g-p-  Mathematically,  this  is  written  as  [60, 

61] 

Typically  a  good  choice  for  a  is  10"^  [61]. 

Next,  the  backtraeking  algorithm  must  be  developed.  The  strategy  for  a  useful  and  practical 
backtracking  routine  is  to  define  the  function  [60, 61] 

u{X)  =  ^xlt^  +Xp)  (101) 

so  that 

M'(A)  =  Vg-/7.  (102) 

The  function  m(A)  is  modeled  with  the  most  current  information  if  a  backtrack  is  needed  and  A  is 
chosen  to  minimize  the  model  [61].  As  mentioned  earlier,  the  first  step  is  always  the  full  Newton 
step  A  =  1.  If  this  step  does  not  meet  the  criterion  in  (100),  a  quadratic  model  is  used  to  compute 
the  appropriate  value  for  A.  That  is  [60]: 

m(A)*[m(1)-m(0)-m'(0)]a2+m'(0)A+w(0).  (103) 

Notice  that  m(0)  and  m'(0)  are  known  and  m(1)  has  just  been  computed.  Taking  the  derivative  o^f 
(103)  and  setting  the  derivative  to  zero  yields  the  minimum  point  for  (103)  [60] 

;  wm 


A=- 


2[m(1)-m(0)-m'(0)]‘ 


(104) 
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If  (104)  still  does  not  satisfy  (100),  subsequent  backtracks  will  be  needed.  For  the  subsequent 
backtracks,  m(A)  is  modeled  as  a  cubic  function  [60] 

«cubic(^)  =  «^^+^^^+«'(0)A+M(0)^  (105) 

where 

’m(Aj)-m'(0)A,  -«(0) 
m(A2)-m'(0)A2-m(0)_ 


a 

_  ■  r 

_b_ 

_ 1 

1 

II 

1/Af 


-\IX\ 


-A2/AJ  Ai/AlJ 


(106) 


Ai  and  A2  are  the  most  recent  previous  two  values  for  A.  The  minimum  point  for  the  cubic  model 
in  (105)  is  [60] 

»  9  » /  f\\ 

(107) 


A  = 


-b  +  ^Ja  -  3au'{0) 


3a 


To  avoid  extremely  small  steps,  a  lower  bound  of  0.1  and  an  upper  bound  of  0.5  should  be  set  for 
A  [60]. 

4.  Linear  Loads  ^ 

The  state  variable  equations  can  be  easily  derived  for  linear  loads.  For  example,  consider  single 
lumped  element  terminations  such  as  a  resistor  R,  a  capacitor  C,  or  inductor  L.  The  state  variable 

equations  would  hence  be: 

Linear  Resistor: 

(I08.a) 


1 


1 


Vc  =- 


-vcp^-pr^tot 


RC,,  Cp 


Linear  Inductor: 


r .  1 

[o  --LI 

r  n 

■  1  ■ 

1 

= 

Cf 

-  0 

IL  J 

+ 

Cf 

0 

Uot 


Linear  Capacitor: 
C+Cr^'°‘' 


(108.b) 


(108.C) 


It  is  noted  that  for  a  network  of  linear  lumped  elements,  state  variable  equations  could  be  similarly 
derived.  ' 

A  Linear  Amplifier 

A  more  interesting  example  of  a  linear  load  would  be  a  linear  amplifier.  Specifically,  consider 
the  small  signal  analysis  of  a  GaAs  MESFET  amplifier.  The  equivalent  circuit  for  this  device  is 
illustrated  in  Fig.  21.  The  state  equation  in  (90)  was  used,  with  the  exception  that  6  is  now  a 

matrix  B.  To  this  end,  from  the  circuit  in  Figs.  20  and  2 1  it  can  be  shown  that: 

^T 


X  =  lv 


Cf 


DF 


ii  ii 


(109.a) 
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A  = 


■  0 

0 

0 

0 

0 

”1  !  ^GF 

0 

0 

-1  /  RiCg, 

1  /  R/Cg, 

MRiCg, 

0 

0 
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0 

1  /  R  '^Cg, 

-MR^Cg, 

-MR^Cg, 

0 

^ICgd 

0 

0 

-l//?,-Q, 

“1  ^  ^idx^dx 

0 

1/Qv 

-1/Q.  ’ 

0 

0 

0 

0 

0 

0 

1  /  C[)p 

^dx  ! 

0 

-^dx !  h 

-Ld  /  Lj 

-LxILr 

-L\gl  Lj 

0 

-4  / 

Lg  /  Lf 

-LgJLj 

Lid  ! 

~^2d  ^ 

(109.b) 

r  1  /  ^GF 

0  1 

1 

B  = 


0 

0 

0 

0 

0 

0 


0 

0 

0 

1  /  Cjjf 
0 
0 


u(t)  = 


^ Drain^^\ 


(109.C) 


Cgf  Cof  are  the  net  cell  capacitances  at  the  gate  port  and  the  drain  port,  respectively. 
Specifically,  assuming  that  the  drain  and  gate  ports  are  distributed  over  a  number  of  cells,  these 
represent  the  total  networked  cell  capacitance,  with  each  computed  as  in  (89).  The  quantities  in  the 
excitation  vector  Icate  and  iDrain  are  the  net  equivalent  current  sources  computed  by  the  line 
integrations  of  the  magnetic  fields  about  the  gate  port  and  the  drain  port,  respectively.  Some  of  the 
other  elements  in  A  are  defined  as: 

Lj^=Lj  +  L^,  Lg^=Lg  +  L^,  ~  ^i^ds  ^ 

^2g  -  K^d  ~  ^d^s^ 


L\d  ~  ~  ^g^s->  ^2d  ~  ^d^g  ^d^s  ^s^g'< 

Lt  ~  ^g^d  ^g^s  ^d^x 


(110) 


-r  ""  ^g^d  ^g^s  ^d^s  ■ 

As  an  example,  a  GaAs  MESFET  was  simulated  using  the  FDTD  algorithm.  The  geometry  of 
the  device  is  illustrated  in  Fig.  22.  The  amplifier  was  matched  at  6  GHz.  For  this  example,  the 

circuit  parameters  were  chosen  yielding  S-parameters  at  6  GHz: 

S,,  =0.636Z- 171.3°,  5,2  =0.069Z16.3°,  52,  =  2.061Z28.63°,  522  =  0.524Z - 95.73° 

For  these  S-parameters,  the  transistor  is  unconditionally  stable  and  a  simultaneous  conjugate 
match  is  possible.  Balanced  shunt  stubs  and  series  lines  are  used  to  design  the  input  and  output 
matching  networks.  The  maximum  transducer  gain  for  the  amplifier  is  1 1.137  dB.  The  layout  is 
illustrated  in  Fig.  22.  The  gate  and  drain  ports  of  the  transistor  are  replaced  by  equivalent  current 
sources,  as  illustrated  in  Fig.  20.  These  circuits  are  loaded  with  the  small  signal  equivalent  model 
in  Fig.  21.  The  ports  actually  span  the  entire  width  of  the  strip.  Thus,  Icate  and  I Drain  represent 
the  total  current  entering  the  gate  and  drain  ports.  This  can  be  done  by  adding  the  currents  flowing 
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along  each  edge  of  the  port,  as  calculated  by  the  line  integrals  of  the  magnetic  field,  or  by 
calculating  the  line  integral  about  the  entire  port.  Similarly,  the  gate  and  drain  capacitance  are 
calculated  as  the  total  parallel  capacitance  across  the  port  width.  The  source  pads  are  terminated  to 
ground  by  a  via. 

Equation  (91)  was  used  to  solve  for  the  state  model  of  the  small  signal  circuit.  The  device  was 
excited  by  a  broad  band  source  and  the  S-parameters  were  then  extracted  over  a  broad  frequency 
range.  These  results  are  illustrated  in  Fig.  23.  The  results  are  also  overlaid  with  the  computation 
done  using  an  empirical  based  analysis  using  HP-EESOF™,  which  modeled  both  the  linear 
MESFET  amplifier  with  the  same  circuit  parameters  and  the  microstrip  circuit.  The  results 
compare  amazingly  well. 


5.  Nonlinear  Loads 

To  demonstrate  the  nonlinear  solution  of  the  state- variable  equations,  two  examples  will  be 
illustrated.  A  nonlinear  diode  and  a  nonlinear  MESFET  amplifier.  The  equivalent  circuit  used  for 
the  nonlinear  diode  is  illustrated  in  Fig.  24  [59].  The  governing  equations  for  this  model  are: 


r  -  D 

V  ^ 

V=-^Qp} 

qv. 


(Ip 


C,r  = 


InMkJF 


(111) 


state  equation  derived  from  the  circuit  in  Fig.  24  is  then: 
1  .  1 


C.  +  C, 


pack 


Cf  +  Cpack 


hot' 


'dp 


-InMkJFiv,  -  -\)  +  IMJFiv,  - 


-  ■ 


(v,  -  - 1)  +  D2nMki,TF 


(112.a) 


(112.b) 


■'‘pack 


•vc^ 


-^pack 


Rb  • 

Lpack"'^' 


(112.C) 


This  set  of  nonlinear  differential  equations  is  solved  using  the  algorithm  described  in  Section  3. 


,As  a  simple  example,  a  50  Q  microstrip  line  was  terminated  by  a  nonlinear  diode.  The 
dimensions  of  the  uniform  microstrip  line  are  identical  to  that  feeding  the  MESFET  amplifier 
example.  The  nonlinear  model  illustrated  in  Fig.  24  was  used  with  tol  =  10-4,  ^  =  io-l2, 
Io  =  1  pA,  =  0.7  V,  r  =  298  K, ;?  =  0.5,  M  =  1.179,  F  =  1  GHz,  Cf  =  5.9976  X  lO'l^  F, 
Lpack  =10  pH,  Cpack  =  0.1  pF,  and  =  0.1  Q.  The  results  of  the  FDTD  simulation  are 
illustrated  in  Fig.  25.  A  2  V  sinusoidal  voltage  pulse  with  a  frequency  of  10  GHz  is  used  to  excite 
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the  microstrip  line.  The  amplitude  of  the  pulse  is  ramped  up  using  a  half-Gaussian  pulse  to  avoid 
high-frequency  transients.  It  is  observed  from  Fig.  25  that  the  voltage  across  the  diode  is  clamped 
at  about  0.7  V, 

The  final  example  is  that  of  a  nonlinear  MESFET  amplifier.  The  geometry  of  the  amplifier  is 
that  of  the  linear  amplifier  presented  in  Fig.  22.  The  equivalent  circuit  used  is  also  that  of  Fig.  21. 
However,  the  voltage-controlled  current  source  Ids  and  the  gate  to  source  capacitor  Cgs  are  now 
treated  as  nonlinear  terms.  Specifically,  [58] 


>  vq,  )  =  (4  +  A^C'S'  +  +  ^3Vc'S')tanh(avc^^ ), 


hi 


(113.a) 

(113.b) 


The  parameters  used  in  (1 13.a)  and  (1 13.b)  for  the  example  illustrated  below  were  the  same  as 
those  presented  in  [58]. 

The  approach  used  in  [58]  is  based  on  the  voltage-source  formulatio|i.  The  approach  used  here 
is  based  on  the  current-source  formulation.  This  has  proven  to  be  a  more  convenient  approach  for 
implementation,  and  the  same  level  of  accuracy  is  achieved.  Then,  from  Fig.  22,  the  state 


equations  are  derived  as: 
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where  several  of  the  quantities  are  defined  in  (110).  The  Jacobian  matrix  7  defined  in  (97.a)  is 

computed  using  the  forward-finite-difference  method  [60]. 

A  simulation  of  a  nonlinear  transistor  or  amplifier  such  as  the  MESFET  using  the  FDTD  or  PGY 
methods  would  proceed  as  follows:  1 )  Set  up  the  DC  bias  by  directly  applying  DC  sources  at  the 
gate  port  and  the  drain  port.  Note  that  the  DC  sources  are  ramped  up  using  a  half-Gaussian  pulse 
and  the  FDTD  simulation  is  run  until  a  stead  state  response  is  achieved.  2)  The  AC  source  is 
superimposed  over  the  DC  source  once  a  DC  steady  state  is  reached.  3)  The  total  voltages  at  the 


49 


S.  Geclney.  Risorous  Analysis  of  Large  Scale  MMIC  Circuit  Devices 


4/98 


input  and  output  ports  (DC+AC)  are  calculated.  4)  Repeat  steps  1-3  but  without  turning  on  the 
steady  state  response.  5)  The  AC  responses  are  obtained  by  subtracting  the  time  responses  in  step 
4  from  those  computed  in  steps  1-3. 

A  nonlinear  MESFET  amplifier  was  simulated  using  the  FDTD  method.  The  example  is  similar 
to  that  presented  in  [58].  The  same  mesh  dimensions  used  for  the  linear  amplifier  (Fig.  22)  were 
used  for  this  case.  A  10  cell  PML  was  used  on  all  walls  (except  for  the  ground  plane)  to  terminate 
the  mesh.  For  this  example,  the  DC  biasing  voltages  used  were  Fes  =  -0.81  V,  and  ^£>5  =  6.4  V. 
The  AC  source  signal  was  a  modulated  Gaussian  pulse  with  center  frequency  =  6  GHz.  The 
amplitude  of  the  source  was  quite  small  so  that  the  device  was  operating  in  a  linear  region.  Thus, 
broadband  information  was  extracted.  The  results  of  the  simulation  are  illustrated  in  Fig.  26.  It  is 
observed  that  the  device  is  matched  at  6  GHz  and  has  a  gain  of  12  dB  at  this  frequency. 
Comparing  these  results  to  those  published  in  [58]  demonstrates  excellent  agreement. 


Aot 


Equivalent  Device  Model 


Fig.  20  Equivalent  circuit  for  the  current-source  state  variable  model. 
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Fig.  22  The  layout  for  the  MESFET  amplifier.  The  microstrips  have  50  characteristic 
impedances  and  are  printed  on  a  3 1  mil  substrate  with  er  =  2.23.  The  FDTD  model  had 
a  spatial  discretization  of  A,v  =  Ay  —  0.3000875  mm,  Az  —  0.15748  mm,  and  a  time 
step  of  A/  =  4.01 1  X  10"*^  s  was  used.  This  lead  to  8  cells  across  the  microstrip  width 
and  5  cells  through  the  substrate. 
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Frequency  in  GHz 

Fig.  23  The  magnitude  of  the  S-parameters  of  the  linear  MESFET  amplifier  computed  by  FDTD 

with  the  state-variable  model  and  compared  by  computations  using  HP-EESOF"''^'. 


Fig.  24  Equivalent  circuit  for  the  diode  model.  Both  Cpack  and  Lpack  are  introduced  to  model 
parasitic  packaging  affects. 
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Fig.  25  Voltage  across  the  nonlinear  diode  terminating  a  50  ohm  microstrip  line. 


) 


Fig.  26  The  magnitude  of  the  S-parameters  computed  by  the  FDTD  simulation  for  the 
nonlinear  MESFET  amplifier. 
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